5.7 KiB
Plugin State-Persistence
Objective
Retain the state of stateful plugins across restarts of Telegraf.
Keywords
framework, plugin, stateful, persistence
Overview
Telegraf contains a number of plugins that hold an internal state while processing. For some of the plugins this state is important for efficient processing like the location when reading a large file or when continuously querying data from a stateful peer requiring for example an offset or the last queried timestamp. For those plugins it is important to persistent their internal state over restarts of Telegraf.
It is intended to
- allow for opt-in of plugins to store a state per plugin instance
- restore the state for each plugin instances at startup
- track the plugin instances over restarts to relate the stored state with a corresponding plugin instance
- automatically compute plugin instance IDs based on the plugin configuration
- provide a way to manually specify instance IDs by the user
- not restore states if the plugin configuration changed between runs
- make implementation easy for plugin developers
- make no assumption on the state content
The persistence will use the following steps:
- Compute an unique ID for each of the plugin instances
- Startup Telegraf plugins calling
Init()
, etc. - Initialize persistence framework with the user specified
statefile
location and load the state if present - Determine all stateful plugin instances by fulfilling the
StatefulPlugin
interface - Restore plugin states (if any) for each plugin ID present in the state-file
- Run data-collection etc...
- On shutdown, stopping all Telegraf plugins calling
Stop()
orClose()
depending on the plugin type - Query the state of all registered stateful plugins state
- Create an overall state-map with the plugin instance ID as a key and the serialized plugin state as value.
- Marshal the overall state-map and store to disk
Potential users of this functionality are plugins continuously querying endpoints with information of a previous query (e.g. timestamps, offsets, transaction tokens, etc.) The following plugins are known to have an internal state. This is not a comprehensive list.
inputs.win_eventlog
(PR #8281)inputs.docker_log
(PR #7749)inputs.tail
(file offset)inputs.cloudwatch
(windowStart
/windowEnd
parameters)inputs.stackdriver
(prevEnd
parameter)
Plugin ID computation
The plugin ID is computed based on the configuration options specified for the
plugin instance. To generate the ID all settings are extracted as string
key-value pairs with the option name being the key and the value being the
configuration option setting. For nested configuration options, e.g. if the
plugins has a sub-table, the options are flattened with a canonical key. The
canonical key elements must be concatenated with a dot (.
) separator. In case
the sub-element is a list of tables, the key must include the index of each
table prefixed by a hash sign i.e. <parent>#<index>.<child>
.
The resulting key-value pairs of configuration options are then sorted by the
key in lexical order to make the resulting ID invariant against changes in the
order of configuration options. The key and the value of each pair are joined
by a colon (:
) to a single string
.
Finally, a SHA256 sum is computed across all key-value strings separated by a
null
byte. The HEX representation of the resulting SHA256 is used as the
plugin instance ID.
State serialization format
The overall Telegraf state maps the plugin IDs (keys) to the serialized state of the corresponding plugin (values). The state data returned by stateful plugins is serialized to JSON. The resulting byte-sequence is used as the value for the overall state. On-disk, the overall state of Telegraf is stored as JSON.
To restore the state of a plugin, the overall Telegraf state is first
deserialized from the on-disk JSON data and a lookup for the plugin ID is
performed in the resulting map. The value, if found, is then deserialized to the
plugin's state data-structure and provided to the plugin after calling Init()
.
Is / Is-not
Is
- A framework to persist states over restarts of Telegraf
- A simple local state store
- A way to restore plugin states between restarts without configuration changes
- A unified API for plugins to use when requiring persistence of a state
Is-Not
- A remote storage framework
- A way to store anything beyond fundamental plugin states
- A data-store or database
- A way to reassign plugin states if their configuration changes
- A tool to interactively adding/removing/modifying states of plugins
- A persistence guarantee beyond clean shutdown (i.e. no crash resistance)
Prior art
- PR #8281: Stores Windows event-log bookmarks in the registry
- PR #7749: Stores container ID and log offset to a file at a user-provided path
- PR #7537: Provides a global state object and periodically queries plugin states to store the state object to a JSON file. This approach does not provide a ID per plugin instance so it seems like there is only a single state for a plugin type
- PR #9476: Register stateful plugins to persister and automatically assigns an ID to plugin instances based on the configuration. The approach also allows to overwrite the automatic ID e.g. with user specified data. It uses the plugin instance ID to store/restore state to the same plugin instance and queries the plugin state on shutdown and write file (currently JSON).