Adding upstream version 1.34.4.
Signed-off-by: Daniel Baumann <daniel@debian.org>
Parent: e393c3af3f
Commit: 4978089aab
4963 changed files with 677545 additions and 0 deletions
docs/specs/tsd-005-output-buffer-strategy.md
@@ -0,0 +1,77 @@

# Telegraf Output Buffer Strategy

## Objective

Introduce a new agent-level config option to choose a disk buffer strategy for
output plugin metric queues.

## Overview

Currently, when a Telegraf output metric queue fills, either because metrics
arrive faster than they can be written or because writes to the output are
failing, the oldest metrics are overwritten and never reach the output. This
specification defines a set of options to make this output queue more durable
by persisting pending metrics to disk rather than holding them only in a
size-limited in-memory queue.

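The overwrite behavior described in the overview can be illustrated with a minimal sketch. The `ringQueue` type below is hypothetical and is not Telegraf's actual buffer implementation; it only shows how a fixed-size queue silently loses its oldest entries once it is full.

```go
package main

import "fmt"

// ringQueue is a hypothetical fixed-capacity queue (not Telegraf's real
// buffer type). When it is full, adding a metric discards the oldest one.
// limit is assumed to be positive.
type ringQueue struct {
	items []string
	limit int
}

// Add appends m and reports how many older metrics were dropped to make room.
func (q *ringQueue) Add(m string) (dropped int) {
	if len(q.items) == q.limit {
		q.items = q.items[1:] // the oldest metric is lost for good
		dropped = 1
	}
	q.items = append(q.items, m)
	return dropped
}

func main() {
	q := &ringQueue{limit: 3}
	dropped := 0
	for _, m := range []string{"m1", "m2", "m3", "m4", "m5"} {
		dropped += q.Add(m)
	}
	fmt.Println(q.items, "dropped:", dropped) // [m3 m4 m5] dropped: 2
}
```

With a limit of three, the two oldest metrics never reach the output, which is the loss the disk-backed strategies are meant to avoid.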
## Keywords

output plugins, agent configuration, persist to disk

## Agent Configuration

The configuration is at the agent level, with options for:

- **Memory**, the current implementation, with no persistence to disk
- **Write-through**, all metrics are also written to disk using a
  Write Ahead Log (WAL) file
- **Disk-overflow**, when the memory buffer fills, metrics are flushed to a
  WAL file to avoid dropping overflow metrics

A further option specifies the directory in which WAL files are stored on
disk, with a default value. These options are global, and leaving them
unchanged selects memory-only mode, retaining current behavior.

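One way the three strategies and the directory option could be resolved is sketched below. All names here (`BufferStrategy`, `BufferConfig`, `ResolveBufferConfig`, the string spellings, and the placeholder default directory) are assumptions for illustration; the specification does not fix option names or the default path.

```go
package main

import "fmt"

// BufferStrategy names one of the three strategies listed above.
// The type and its string spellings are assumptions for illustration.
type BufferStrategy string

const (
	StrategyMemory       BufferStrategy = "memory"
	StrategyWriteThrough BufferStrategy = "write-through"
	StrategyDiskOverflow BufferStrategy = "disk-overflow"
)

// defaultBufferDir is a placeholder: the specification says a default
// exists but does not say what it is.
const defaultBufferDir = "telegraf-buffer"

// BufferConfig holds the two global, agent-level settings.
type BufferConfig struct {
	Strategy  BufferStrategy
	Directory string
}

// ResolveBufferConfig applies defaults: an unset strategy means memory-only
// mode, and an unset directory means the default directory.
func ResolveBufferConfig(strategy, dir string) (BufferConfig, error) {
	cfg := BufferConfig{Strategy: StrategyMemory, Directory: defaultBufferDir}
	switch BufferStrategy(strategy) {
	case "":
		// Nothing configured: keep the current in-memory behavior.
	case StrategyMemory, StrategyWriteThrough, StrategyDiskOverflow:
		cfg.Strategy = BufferStrategy(strategy)
	default:
		return BufferConfig{}, fmt.Errorf("unknown buffer strategy %q", strategy)
	}
	if dir != "" {
		cfg.Directory = dir
	}
	return cfg, nil
}

func main() {
	cfg, err := ResolveBufferConfig("", "")
	fmt.Println(cfg.Strategy, cfg.Directory, err)
}
```

Rejecting unknown strategy names at load time, rather than falling back silently, keeps a typo from quietly disabling persistence.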
## Metric Ordering and Tracking

Tracking metrics will be accepted on a successful write to the output
destination. Metrics will be written to their appropriate output in the order
they are received in the buffer regardless of which buffer strategy is chosen.

## Disk Utilization and File Handling

Each output plugin has its own in-memory output buffer and therefore its own
WAL file for buffer persistence. In disk-overflow mode, this file may not
exist if Telegraf is able to write all of its metrics without ever filling the
in-memory buffer; in memory mode, it never exists. Telegraf should use one
file per output plugin, and remove entries from the WAL file as they are
written to the output.

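A minimal sketch of the disk-overflow bookkeeping for a single output follows. It models the WAL as a slice rather than a real file, and the `overflowBuffer` type and its methods are hypothetical; the point is only to show that overflow entries stay behind the in-memory ones and are removed once they have been written.

```go
package main

import "fmt"

// overflowBuffer is a hypothetical per-output buffer. The wal slice stands
// in for the WAL file; a real implementation would persist it to disk.
// limit is assumed to be positive.
type overflowBuffer struct {
	mem   []string // in-memory queue, oldest first
	wal   []string // overflow entries, all newer than anything in mem
	limit int
}

// Add queues a metric. Once anything has overflowed, new metrics must also
// go to the WAL so that arrival order is preserved.
func (b *overflowBuffer) Add(m string) {
	if len(b.wal) > 0 || len(b.mem) >= b.limit {
		b.wal = append(b.wal, m)
		return
	}
	b.mem = append(b.mem, m)
}

// Flush writes up to n of the oldest metrics. On success they are removed
// and memory is refilled from the WAL, deleting the moved WAL entries. On
// failure nothing is removed, so the same batch is retried later.
func (b *overflowBuffer) Flush(n int, write func([]string) error) error {
	if n > len(b.mem) {
		n = len(b.mem)
	}
	batch := append([]string(nil), b.mem[:n]...) // copy to avoid aliasing
	if err := write(batch); err != nil {
		return err
	}
	b.mem = b.mem[n:]
	for len(b.mem) < b.limit && len(b.wal) > 0 {
		b.mem = append(b.mem, b.wal[0])
		b.wal = b.wal[1:]
	}
	return nil
}

func main() {
	b := &overflowBuffer{limit: 2}
	for _, m := range []string{"m1", "m2", "m3", "m4", "m5"} {
		b.Add(m)
	}
	var written []string
	for len(b.mem) > 0 {
		if err := b.Flush(2, func(batch []string) error {
			written = append(written, batch...)
			return nil
		}); err != nil {
			panic(err)
		}
	}
	fmt.Println(written) // [m1 m2 m3 m4 m5]
}
```

If the memory buffer never fills, `wal` stays empty for the whole run, which corresponds to the WAL file never being created in disk-overflow mode.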
Telegraf will not make any attempt to limit the size on disk taken by these
files beyond cleaning up WAL files for metrics that have successfully been
flushed to their output destination. It is the user's responsibility to ensure
these files do not entirely fill the disk, both during Telegraf uptime and
with lingering files from previous instances of the program.

If WAL files exist for an output plugin from previous instances of Telegraf,
they will be picked up and flushed before any new metrics are written to the
output. This ensures that these metrics are not lost and that output write
order remains consistent.

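The startup rule above can be sketched as follows: metrics left over from a previous run are queued ahead of anything newly collected. The one-metric-per-line file format, the file name, and the function names are assumptions made to keep the example short; they are not the on-disk format Telegraf uses.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// loadLeftover reads metrics from a WAL file left by a previous run,
// assuming a hypothetical one-metric-per-line format. A missing file simply
// means there is nothing to recover.
func loadLeftover(path string) ([]string, error) {
	f, err := os.Open(path)
	if errors.Is(err, fs.ErrNotExist) {
		return nil, nil
	}
	if err != nil {
		return nil, err
	}
	defer f.Close()

	var metrics []string
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		if line := sc.Text(); line != "" {
			metrics = append(metrics, line)
		}
	}
	return metrics, sc.Err()
}

// startupQueue places recovered metrics ahead of newly collected ones so
// that the output sees them in their original order.
func startupQueue(walPath string, fresh []string) ([]string, error) {
	old, err := loadLeftover(walPath)
	if err != nil {
		return nil, err
	}
	return append(old, fresh...), nil
}

func main() {
	dir, err := os.MkdirTemp("", "wal-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// Simulate a WAL file left behind by an earlier instance.
	path := filepath.Join(dir, "outputs.example.wal")
	if err := os.WriteFile(path, []byte("old1\nold2\n"), 0o600); err != nil {
		panic(err)
	}

	queue, err := startupQueue(path, []string{"new1"})
	if err != nil {
		panic(err)
	}
	fmt.Println(queue) // [old1 old2 new1]
}
```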
Telegraf must additionally provide a way to manually flush WAL files via
some separate plugin or similar. This could be used to ensure that WAL files
are properly written in the event that the output plugin changes and a new
instance of Telegraf is unable to detect the existing WAL file. The buffer
strategy should work without requiring this plugin.

## Is/Is-not

- Is a way to prevent metrics from being dropped due to a full memory buffer
- Is not a way to guarantee data safety in the event of a crash or system failure
- Is not a way to manage file system allocation size; file space will be used
  until the disk is full

## Prior art

[Initial issue](https://github.com/influxdata/telegraf/issues/802)
[Loose specification issue](https://github.com/influxdata/telegraf/issues/14805)