Adding upstream version 1.34.4.
Signed-off-by: Daniel Baumann <daniel@debian.org>
This commit is contained in:
parent
e393c3af3f
commit
4978089aab
4963 changed files with 677545 additions and 0 deletions
8
docs/developers/CODE_STYLE.md
Normal file
|
@ -0,0 +1,8 @@
|
|||
# Code Style
|
||||
|
||||
Code is required to be formatted using `gofmt`; this covers most code style
|
||||
requirements. It is also highly recommended to use `goimports` to
|
||||
automatically order imports.
|
||||
|
||||
Please try to keep line length under 80 characters; the exact number of
|
||||
characters is not strict but it generally helps with readability.
|
84
docs/developers/DEBUG.md
Normal file
|
@ -0,0 +1,84 @@
|
|||
# Debug
|
||||
|
||||
The following describes how to use the [delve][1] debugger with telegraf
|
||||
during development. Delve has many, very well documented [subcommands][2] and
|
||||
options.
|
||||
|
||||
[1]: https://github.com/go-delve/delve
|
||||
[2]: https://github.com/go-delve/delve/blob/master/Documentation/usage/README.md
|
||||
|
||||
## CLI
|
||||
|
||||
To run telegraf manually, users can run:
|
||||
|
||||
```bash
|
||||
go run ./cmd/telegraf --config config.toml
|
||||
```
|
||||
|
||||
To attach delve with a similar config users can run the following. Note the
|
||||
additional `--` to specify flags passed to telegraf. Additional flags need to
|
||||
go after this double dash:
|
||||
|
||||
```bash
|
||||
$ dlv debug ./cmd/telegraf -- --config config.toml
|
||||
Type 'help' for list of commands.
|
||||
(dlv)
|
||||
```
|
||||
|
||||
At this point a user could set breakpoints and continue execution.
|
||||
|
||||
## Visual Studio Code
|
||||
|
||||
Visual Studio Code's [go language extension][20] includes the ability to easily
|
||||
make use of [delve for debugging][21]. Check out this [full tutorial][22] from
|
||||
the go extension's wiki.
|
||||
|
||||
A basic config is all that is required along with additional arguments to tell
|
||||
Telegraf where the config is located:
|
||||
|
||||
```json
|
||||
{
|
||||
// Use IntelliSense to learn about possible attributes.
|
||||
// Hover to view descriptions of existing attributes.
|
||||
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
|
||||
"version": "0.2.0",
|
||||
"configurations": [
|
||||
{
|
||||
"name": "Launch Package",
|
||||
"type": "go",
|
||||
"request": "launch",
|
||||
"mode": "auto",
|
||||
"program": "${fileDirname}",
|
||||
"args": ["--config", "/path/to/config"]
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
[20]: https://code.visualstudio.com/docs/languages/go
|
||||
[21]: https://code.visualstudio.com/docs/languages/go#_debugging
|
||||
[22]: https://github.com/golang/vscode-go/wiki/debugging
|
||||
|
||||
## GoLand
|
||||
|
||||
JetBrains' [GoLand][30] also includes full featured [debugging][31] options.
|
||||
|
||||
The following is an example debug config to run Telegraf with a config:
|
||||
|
||||
```xml
|
||||
<component name="ProjectRunConfigurationManager">
|
||||
<configuration default="false" name="build &amp; run" type="GoApplicationRunConfiguration" factoryName="Go Application">
|
||||
<module name="telegraf" />
|
||||
<working_directory value="$PROJECT_DIR$" />
|
||||
<parameters value="--config telegraf.conf" />
|
||||
<kind value="DIRECTORY" />
|
||||
<package value="github.com/influxdata/telegraf" />
|
||||
<directory value="$PROJECT_DIR$/cmd/telegraf" />
|
||||
<filePath value="$PROJECT_DIR$" />
|
||||
<method v="2" />
|
||||
</configuration>
|
||||
</component>
|
||||
```
|
||||
|
||||
[30]: https://www.jetbrains.com/go/
|
||||
[31]: https://www.jetbrains.com/help/go/debugging-code.html
|
93
docs/developers/DEPRECATION.md
Normal file
|
@ -0,0 +1,93 @@
|
|||
# Deprecation
|
||||
|
||||
Deprecation is the primary tool for making changes in Telegraf. A deprecation
|
||||
indicates that the community should move away from using a feature, and
|
||||
documents that the feature will be removed in the next major update (2.0).
|
||||
|
||||
Key to deprecation is that the feature remains in Telegraf and the behavior is
|
||||
not changed.
|
||||
|
||||
We do not have a strict definition of a breaking change. All code changes
|
||||
change behavior, the decision to deprecate or make the change immediately is
|
||||
decided based on the impact.
|
||||
|
||||
## Deprecate plugins
|
||||
|
||||
Add an entry to the plugins deprecation list (e.g. in `plugins/inputs/deprecations.go`). Include the deprecation version
|
||||
and any replacement, e.g.
|
||||
|
||||
```golang
|
||||
"logparser": {
|
||||
Since: "1.15.0",
|
||||
Notice: "use 'inputs.tail' with 'grok' data format instead",
|
||||
},
|
||||
```
|
||||
|
||||
The entry can contain an optional `RemovalIn` field specifying the planned version for removal of the plugin.
|
||||
|
||||
Also add the deprecation warning to the plugin's README:
|
||||
|
||||
```markdown
|
||||
# Logparser Input Plugin
|
||||
|
||||
### **Deprecated in 1.15**: Please use the [tail][] plugin along with the
|
||||
`grok` [data format][].
|
||||
|
||||
[tail]: /plugins/inputs/tail/README.md
|
||||
[data format]: /docs/DATA_FORMATS_INPUT.md
|
||||
```
|
||||
|
||||
Telegraf will automatically check if a deprecated plugin is configured and print a warning:
|
||||
|
||||
```text
|
||||
2022-01-26T20:08:15Z W! DeprecationWarning: Plugin "inputs.logparser" deprecated since version 1.15.0 and will be removed in 2.0.0: use 'inputs.tail' with 'grok' data format instead
|
||||
```
|
||||
|
||||
## Deprecate options
|
||||
|
||||
Mark the option as deprecated in the sample config, include the deprecation
|
||||
version and any replacement.
|
||||
|
||||
```toml
|
||||
## Broker to publish to.
|
||||
## deprecated in 1.7; use the brokers option
|
||||
# url = "amqp://localhost:5672/influxdb"
|
||||
```
|
||||
|
||||
In the plugin's configuration struct, add a `deprecated` tag to the option:
|
||||
|
||||
```go
|
||||
type AMQP struct {
|
||||
URL string `toml:"url" deprecated:"1.7.0;use 'brokers' instead"`
|
||||
Precision string `toml:"precision" deprecated:"1.2.0;option is ignored"`
|
||||
}
|
||||
```
|
||||
|
||||
The `deprecated` tag has the format `<since version>[;removal version];<notice>`, where the `removal version` is optional. Telegraf automatically displays the specified deprecation info if the option is used in the config:
|
||||
|
||||
```text
|
||||
2022-01-26T20:08:15Z W! DeprecationWarning: Option "url" of plugin "outputs.amqp" deprecated since version 1.7.0 and will be removed in 2.0.0: use 'brokers' instead
|
||||
```
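The tag format can be illustrated with a small parser. This is a minimal sketch, not Telegraf's own implementation: `parseDeprecatedTag` and `listDeprecations` are hypothetical helpers, and the removal version on `Precision` is an assumption added to show the three-part form.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// AMQP mirrors the example struct from this section. The "2.0.0" removal
// version on Precision is an assumption added for illustration.
type AMQP struct {
	URL       string   `toml:"url" deprecated:"1.7.0;use 'brokers' instead"`
	Precision string   `toml:"precision" deprecated:"1.2.0;2.0.0;option is ignored"`
	Brokers   []string `toml:"brokers"`
}

// deprecation holds the parts of one `deprecated` tag.
type deprecation struct {
	Since, RemovalIn, Notice string
}

// parseDeprecatedTag is a hypothetical helper that splits
// "<since version>[;removal version];<notice>" into its parts.
// Limitation of this sketch: a notice that itself contains ';' is
// ambiguous when no removal version is given.
func parseDeprecatedTag(tag string) (deprecation, error) {
	parts := strings.SplitN(tag, ";", 3)
	switch len(parts) {
	case 2:
		return deprecation{Since: parts[0], Notice: parts[1]}, nil
	case 3:
		return deprecation{Since: parts[0], RemovalIn: parts[1], Notice: parts[2]}, nil
	}
	return deprecation{}, fmt.Errorf("malformed deprecated tag %q", tag)
}

// listDeprecations walks the fields of a config struct and returns the
// parsed tag of every deprecated option, keyed by its TOML name.
func listDeprecations(cfg interface{}) (map[string]deprecation, error) {
	found := make(map[string]deprecation)
	t := reflect.TypeOf(cfg)
	for i := 0; i < t.NumField(); i++ {
		field := t.Field(i)
		tag, ok := field.Tag.Lookup("deprecated")
		if !ok {
			continue
		}
		info, err := parseDeprecatedTag(tag)
		if err != nil {
			return nil, fmt.Errorf("field %s: %w", field.Name, err)
		}
		found[field.Tag.Get("toml")] = info
	}
	return found, nil
}

func main() {
	infos, err := listDeprecations(AMQP{})
	if err != nil {
		panic(err)
	}
	for option, info := range infos {
		fmt.Printf("option %q: since %s, removal %q, notice: %s\n",
			option, info.Since, info.RemovalIn, info.Notice)
	}
}
```

Keeping the notice as the last element means the optional removal version can be added later without rewording existing tags.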
|
||||
|
||||
### Option value
|
||||
|
||||
In the case a specific option value is being deprecated, the method `models.PrintOptionValueDeprecationNotice` needs to be called in the plugin's `Init` method.
|
||||
|
||||
## Deprecate metrics
|
||||
|
||||
In the README document the metric as deprecated. If there is a replacement field,
|
||||
tag, or measurement then mention it.
|
||||
|
||||
```markdown
|
||||
- system
|
||||
- fields:
|
||||
- uptime_format (string, deprecated in 1.10: use `uptime` field)
|
||||
```
|
||||
|
||||
Add filtering to the sample config, leave it commented out.
|
||||
|
||||
```toml
|
||||
[[inputs.system]]
|
||||
## Uncomment to remove deprecated metrics.
|
||||
# fieldexclude = ["uptime_format"]
|
||||
```
|
79
docs/developers/LOGGING.md
Normal file
|
@ -0,0 +1,79 @@
|
|||
# Logging
|
||||
|
||||
## Plugin Logging
|
||||
|
||||
You can access the Logger for a plugin by defining a field named `Log`. This
|
||||
`Logger` is configured internally with the plugin name and alias so they do not
|
||||
need to be specified for each log call.
|
||||
|
||||
```go
|
||||
type MyPlugin struct {
|
||||
Log telegraf.Logger `toml:"-"`
|
||||
}
|
||||
```
|
||||
|
||||
You can then use this Logger in the plugin. Use the method corresponding to
|
||||
the log level of the message.
|
||||
|
||||
```go
|
||||
p.Log.Errorf("Unable to write to file: %v", err)
|
||||
```
|
||||
|
||||
## Agent Logging
|
||||
|
||||
In other sections of the code it is required to add the log level and module
|
||||
manually:
|
||||
|
||||
```go
|
||||
log.Printf("E! [agent] Error writing to %s: %v", output.LogName(), err)
|
||||
```
|
||||
|
||||
## When to Log
|
||||
|
||||
Log a message if an error occurs but the plugin can continue working. For
|
||||
example if the plugin handles several servers and only one of them has a fatal
|
||||
error, it can be logged as an error.
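A minimal sketch of that situation follows. The `logger` interface, `memoryLogger`, the `Example` plugin, and its `query` function are hypothetical stand-ins so the snippet runs on its own; a real plugin would use its `telegraf.Logger` field instead.

```go
package main

import (
	"errors"
	"fmt"
)

// logger is a stand-in for the plugin logger (assumption for this sketch).
type logger interface {
	Errorf(format string, args ...interface{})
}

// memoryLogger records messages so the example can show what was logged.
type memoryLogger struct {
	lines []string
}

func (m *memoryLogger) Errorf(format string, args ...interface{}) {
	m.lines = append(m.lines, "E! "+fmt.Sprintf(format, args...))
}

// Example is a hypothetical plugin that polls several servers.
type Example struct {
	Servers []string
	Log     logger
	query   func(server string) (int, error)
}

// Gather logs the failure of a single server and keeps going, so the
// remaining servers are still collected.
func (e *Example) Gather() map[string]int {
	results := make(map[string]int)
	for _, server := range e.Servers {
		value, err := e.query(server)
		if err != nil {
			e.Log.Errorf("Querying %q failed: %v", server, err)
			continue
		}
		results[server] = value
	}
	return results
}

// demoGather runs Gather against three fake servers, one of which fails.
func demoGather() (map[string]int, []string) {
	log := &memoryLogger{}
	plugin := &Example{
		Servers: []string{"a:8080", "b:8080", "c:8080"},
		Log:     log,
		query: func(server string) (int, error) {
			if server == "b:8080" {
				return 0, errors.New("connection refused")
			}
			return 42, nil
		},
	}
	return plugin.Gather(), log.lines
}

func main() {
	results, lines := demoGather()
	fmt.Println(len(results), "servers collected")
	for _, line := range lines {
		fmt.Println(line)
	}
}
```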
|
||||
|
||||
Use logging judiciously for debug purposes. Since Telegraf does not currently
|
||||
support setting the log level on a per module basis, it is especially important
|
||||
to not overdo it with debug logging.
|
||||
|
||||
If the plugin is listening on a socket, log a message with the address of the socket:
|
||||
|
||||
```go
|
||||
p.Log.Infof("Listening on %s://%s", protocol, l.Addr())
|
||||
```
|
||||
|
||||
## When not to Log
|
||||
|
||||
Don't use logging to emit performance data or other meta data about the plugin,
|
||||
instead use the `internal` plugin and the `selfstats` package.
|
||||
|
||||
Don't log fatal errors in the plugin that require the plugin to return, instead
|
||||
return them from the function and Telegraf will handle the logging.
|
||||
|
||||
Don't log for static configuration errors, check for them in a plugin `Init()`
|
||||
function and return an error there.
|
||||
|
||||
Don't log a warning every time a plugin is called for situations that are
|
||||
normal on some systems.
|
||||
|
||||
## Log Level
|
||||
|
||||
The log level is indicated by a single character at the start of the log
|
||||
message. Adding this prefix is not required when using the Plugin Logger.
|
||||
|
||||
- `D!` Debug
|
||||
- `I!` Info
|
||||
- `W!` Warning
|
||||
- `E!` Error
|
||||
|
||||
## Style
|
||||
|
||||
Log messages should be capitalized and be a single line.
|
||||
|
||||
If it includes data received from another system or process, such as the text
|
||||
of an error message, the text should be quoted with `%q`.
|
||||
|
||||
Use the `%v` format for the Go error type instead of `%s` to ensure a nil error
|
||||
is printed.
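The difference between the verbs is easy to see with the standard `fmt` package. The sketch below is illustrative only; `formatExamples` and the sample messages are made up for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// formatExamples renders the same kind of message with different verbs.
func formatExamples(err error, remote string) (withS, withV, quoted string) {
	withS = fmt.Sprintf("Request failed: %s", err)
	withV = fmt.Sprintf("Request failed: %v", err)
	quoted = fmt.Sprintf("Server replied: %q", remote)
	return withS, withV, quoted
}

func main() {
	var nilErr error
	s, v, q := formatExamples(nilErr, "disk full\nretry later")
	fmt.Println(s) // Request failed: %!s(<nil>)
	fmt.Println(v) // Request failed: <nil>
	fmt.Println(q) // Server replied: "disk full\nretry later"

	_, v, _ = formatExamples(errors.New("connection refused"), "")
	fmt.Println(v) // Request failed: connection refused
}
```

With `%q` the embedded newline is escaped, so text from another system cannot break the single-line rule.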
|
49
docs/developers/METRIC_FORMAT_CHANGES.md
Normal file
|
@ -0,0 +1,49 @@
|
|||
# Metric Format Changes
|
||||
|
||||
When making changes to an existing input plugin, care must be taken not to change the metric format in ways that will cause trouble for existing users. This document helps developers understand how to make metric format changes safely.
|
||||
|
||||
## Changes can cause incompatibilities
|
||||
|
||||
If the metric format changes, data collected in the new format can be incompatible with data in the old format. Database queries designed around the old format may not work with the new format. This can cause application failures.
|
||||
|
||||
Some metric format changes don't cause incompatibilities. Also, some unsafe changes are necessary. How do you know what changes are safe and what to do if your change isn't safe?
|
||||
|
||||
## Guidelines
|
||||
|
||||
The main guideline is just to keep compatibility in mind when making changes. Often developers are focused on making a change that fixes their particular problem and they forget that many people use the existing code and will upgrade. When you're coding, keep existing users and applications in mind.
|
||||
|
||||
### Renaming, removing, reusing
|
||||
|
||||
Database queries refer to the metric and its tags and fields by name. Any Telegraf code change that changes those names has the potential to break an existing query. Similarly, removing tags or fields can break queries.
|
||||
|
||||
Changing the meaning of an existing tag value or field value or reusing an existing one in a new way isn't safe. Although queries that use these tags/fields may not break, they will not work as they did before the change.
|
||||
|
||||
Adding a field doesn't break existing queries. Queries that select all fields and/or tags (like "select * from") will return an extra series but this is often useful.
|
||||
|
||||
### Performance and storage
|
||||
|
||||
Time series databases can store large amounts of data but many of them don't perform well on high cardinality data. If a metric format change includes a new tag that holds high cardinality data, database performance could be reduced enough to cause existing applications not to work as they previously did. Metric format changes that dramatically increase the number of tags or fields of a metric can increase database storage requirements unexpectedly. Both of these types of changes are unsafe.
|
||||
|
||||
### Make unsafe changes opt-in
|
||||
|
||||
If your change has the potential to seriously affect existing users, the change must be opt-in. To do this, add a plugin configuration setting that lets the user select the metric format. Make the setting's default value select the old metric format. When new users add the plugin they can choose the new format and get its benefits. When existing users upgrade, their config files won't have the new setting so the default will ensure that there is no change.
|
||||
|
||||
When adding a setting, avoid using a boolean and consider instead a string or int for future flexibility. A boolean can only handle two formats but a string can handle many. For example, compare use_new_format=true and features=["enable_foo_fields"]; the latter is much easier to extend and still very descriptive.
|
||||
|
||||
If you want to encourage existing users to use the new format you can log a warning once on startup when the old format is selected. The warning should tell users in a gentle way that they can upgrade to a better metric format. If it doesn't make sense to maintain multiple metric formats forever, you can change the default on a major release or even remove the old format completely. See [[Deprecation]] for details.
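An opt-in setting of this kind could look roughly like the following sketch. The `Example` plugin, the `metric_format` option, and the values `v1`/`v2` are all assumptions chosen for illustration; they do not refer to an existing Telegraf plugin.

```go
package main

import (
	"fmt"
	"log"
)

// Example is a hypothetical input plugin.
type Example struct {
	// MetricFormat selects the metric schema. It is empty when the option
	// is absent from the config, as it is for users who upgrade.
	MetricFormat string `toml:"metric_format"`
}

// Init applies the backwards compatible default and rejects unknown values.
func (e *Example) Init() error {
	if e.MetricFormat == "" {
		e.MetricFormat = "v1" // old format stays the default
	}
	switch e.MetricFormat {
	case "v1":
		// Init runs once at startup, so this hint is printed only once.
		log.Print("W! [inputs.example] Using the 'v1' metric format; " +
			"consider setting metric_format = \"v2\"")
	case "v2":
		// New format, explicitly selected by the user.
	default:
		return fmt.Errorf("unknown metric_format %q", e.MetricFormat)
	}
	return nil
}

func main() {
	upgraded := &Example{} // config written before the option existed
	if err := upgraded.Init(); err != nil {
		panic(err)
	}
	fmt.Println("format in use:", upgraded.MetricFormat)
}
```

A string leaves room for a later `v3` without adding another option, which a boolean could not offer.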
|
||||
|
||||
### Utility
|
||||
|
||||
Changes should be useful to many or most users. A change that is only useful for a small number of users may not be accepted, even if it's off by default.
|
||||
|
||||
## Summary table
|
||||
|
||||
|        | delete | rename | add |
| ------ | ------ | ------ | --- |
| metric | unsafe | unsafe | safe |
| tag    | unsafe | unsafe | be careful with cardinality |
| field  | unsafe | unsafe | ok as long as it's useful for existing users and is worth the added space |
|
||||
|
||||
## References
|
||||
|
||||
InfluxDB Documentation: "Schema and data layout"
|
70
docs/developers/PACKAGING.md
Normal file
|
@ -0,0 +1,70 @@
|
|||
# Packaging
|
||||
|
||||
Building the packages for Telegraf is automated using [Make](https://en.wikipedia.org/wiki/Make_(software)). Running `make` on its own builds a Telegraf binary for the operating system and architecture you are using (if it is supported). To build packages, run `make package`, which builds all the supported packages. You will most likely only want a subset; define it by overriding the `include_packages` variable, for example `make package include_packages="amd64.deb"`. You can also build all packages for a specific architecture with `make package include_packages="$(make amd64)"`.
|
||||
|
||||
The packaging steps require certain tools to be set up beforehand. These dependencies are listed in the ci.docker file, which you can find in the scripts directory. It is therefore recommended to use Docker to build the artifacts; see more details below.
|
||||
|
||||
## Go Version
|
||||
|
||||
Telegraf will be built using the latest version of Go whenever possible.
|
||||
|
||||
### Update CI image
|
||||
|
||||
Incrementing the version is maintained by the core Telegraf team because it requires access to an internal docker repository that hosts the docker CI images. When a new version is released, the following process is followed:
|
||||
|
||||
1. Within the `Makefile`, `.circleci/config.yml`, and `scripts/ci.docker` files
|
||||
update the Go versions to the new version number
|
||||
2. Run `make ci`, this requires quay.io internal permissions
|
||||
3. The files `scripts/installgo_linux.sh`, `scripts/installgo_mac.sh`, and
|
||||
`scripts/installgo_windows.sh` need to be updated as well with the new Go
|
||||
version and SHA
|
||||
4. Create a pull request with these new changes, and verify the CI passes and
|
||||
uses the new docker image
|
||||
|
||||
See the [previous PRs](https://github.com/influxdata/telegraf/search?q=chore+update+go&type=commits) as examples.
|
||||
|
||||
### Access to quay.io
|
||||
|
||||
A member of the team needs to invite you to the quay.io organization.
|
||||
To push new images, the user needs to do the following:
|
||||
|
||||
1. Create a password if the user logged in using Google authentication
|
||||
2. Download an encrypted username/password from the quay.io user page
|
||||
3. Run `docker login quay.io` and enter in the encrypted username and password
|
||||
from the previous step
|
||||
|
||||
## Package using Docker
|
||||
|
||||
This packaging method uses the CI images, and is very similar to how the
|
||||
official packages are created on release. This is the recommended method for
|
||||
building the rpm/deb as it is less system dependent.
|
||||
|
||||
Pull the CI images from quay, the version corresponds to the version of Go
|
||||
that is used to build the binary:
|
||||
|
||||
```shell
|
||||
docker pull quay.io/influxdb/telegraf-ci:1.9.7
|
||||
```
|
||||
|
||||
Start a shell in the container:
|
||||
|
||||
```shell
|
||||
docker run -ti quay.io/influxdb/telegraf-ci:1.9.7 /bin/bash
|
||||
```
|
||||
|
||||
From within the container:
|
||||
|
||||
1. `go get -d github.com/influxdata/telegraf`
|
||||
2. `cd /go/src/github.com/influxdata/telegraf`
|
||||
3. `git checkout release-1.10`
|
||||
* Replace branch `release-1.10` with the version of Telegraf you would like to build
|
||||
4. `git reset --hard 1.10.2`
|
||||
5. `make deps`
|
||||
6. `make package include_packages="amd64.deb"`
|
||||
* Change `include_packages` to change what package you want, run `make help` to see possible values
|
||||
|
||||
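From the host system, copy the build artifacts out of the container. Replace `romantic_ptolemy` with the name of your container and adjust the file name to match the package you built: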
|
||||
|
||||
```shell
|
||||
docker cp romantic_ptolemy:/go/src/github.com/influxdata/telegraf/build/telegraf-1.10.2-1.x86_64.rpm .
|
||||
```
|
66
docs/developers/PROFILING.md
Normal file
|
@ -0,0 +1,66 @@
|
|||
# Profiling
|
||||
|
||||
This article describes how to collect performance traces and memory profiles
|
||||
from Telegraf. If you are submitting this for an issue, please include the
|
||||
version.txt generated below.
|
||||
|
||||
Use the `--pprof-addr` option to enable the profiler. The easiest way to do
|
||||
this may be to add this line to `/etc/default/telegraf`:
|
||||
|
||||
```shell
|
||||
TELEGRAF_OPTS="--pprof-addr localhost:6060"
|
||||
```
|
||||
|
||||
Restart Telegraf to activate the profile address.
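The `/debug/pprof/...` paths used in this article match those served by Go's standard `net/http/pprof` package. As a rough illustration of what such an endpoint provides, the following sketch starts a throwaway local server and downloads a heap profile from it. It is an assumption-laden stand-in, not Telegraf's own wiring of `--pprof-addr`; `newProfilingServer` and `fetchProfile` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	_ "net/http/pprof" // registers /debug/pprof/* on http.DefaultServeMux
)

// newProfilingServer starts a local test server that exposes the pprof
// handlers. It stands in for a process started with a profiling address.
func newProfilingServer() *httptest.Server {
	return httptest.NewServer(http.DefaultServeMux)
}

// fetchProfile requests one named profile and returns the HTTP status code
// and the size of the response body in bytes.
func fetchProfile(baseURL, name string) (int, int, error) {
	resp, err := http.Get(baseURL + "/debug/pprof/" + name)
	if err != nil {
		return 0, 0, err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return 0, 0, err
	}
	return resp.StatusCode, len(body), nil
}

func main() {
	srv := newProfilingServer()
	defer srv.Close()

	status, size, err := fetchProfile(srv.URL, "heap")
	if err != nil {
		panic(err)
	}
	fmt.Printf("heap profile: HTTP %d, %d bytes\n", status, size)
}
```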
|
||||
|
||||
## Trace Profile
|
||||
|
||||
Collect a trace during the time where the performance issue is occurring. This
|
||||
example collects a 10-second trace, so the request takes 10 seconds to complete:
|
||||
|
||||
```shell
|
||||
curl 'http://localhost:6060/debug/pprof/trace?seconds=10' > trace.bin
|
||||
telegraf --version > version.txt
|
||||
go env GOOS GOARCH >> version.txt
|
||||
```
|
||||
|
||||
The `trace.bin` and `version.txt` files can be sent in for analysis or, if desired, you can
|
||||
analyze the trace with:
|
||||
|
||||
```shell
|
||||
go tool trace trace.bin
|
||||
```
|
||||
|
||||
## Memory Profile
|
||||
|
||||
Collect a heap memory profile:
|
||||
|
||||
```shell
|
||||
curl 'http://localhost:6060/debug/pprof/heap' > mem.prof
|
||||
telegraf --version > version.txt
|
||||
go env GOOS GOARCH >> version.txt
|
||||
```
|
||||
|
||||
Analyze:
|
||||
|
||||
```shell
|
||||
$ go tool pprof mem.prof
|
||||
(pprof) top5
|
||||
```
|
||||
|
||||
## CPU Profile
|
||||
|
||||
Collect a 30s CPU profile:
|
||||
|
||||
```shell
|
||||
curl 'http://localhost:6060/debug/pprof/profile' > cpu.prof
|
||||
telegraf --version > version.txt
|
||||
go env GOOS GOARCH >> version.txt
|
||||
```
|
||||
|
||||
Analyze:
|
||||
|
||||
```shell
|
||||
$ go tool pprof cpu.prof
|
||||
(pprof) top5
|
||||
```
|
1
docs/developers/README.md
Symbolic link
|
@ -0,0 +1 @@
|
|||
../../CONTRIBUTING.md
|
185
docs/developers/REVIEWS.md
Normal file
|
@ -0,0 +1,185 @@
|
|||
# Reviews
|
||||
|
||||
Pull-requests require two approvals before being merged. Expect several rounds of back and forth on
|
||||
reviews; non-trivial changes are rarely accepted on the first pass. It might take some time
|
||||
until you see a first review so please be patient.
|
||||
|
||||
All pull requests should follow the style and best practices in the
|
||||
[CONTRIBUTING.md](https://github.com/influxdata/telegraf/blob/master/CONTRIBUTING.md)
|
||||
document.
|
||||
|
||||
## Process
|
||||
|
||||
The review process is roughly structured as follows:
|
||||
|
||||
1. Submit a pull request.
|
||||
Please check that you signed the [CLA](https://www.influxdata.com/legal/cla/) (and [Corporate CLA](https://www.influxdata.com/legal/ccla/) if you are contributing code as an employee of your company). Provide a short description of your submission and reference any issues that it closes. Make sure the CI tests are all green and there are no linter issues.
|
||||
1. Get feedback from a first reviewer and a `ready for final review` tag.
|
||||
Please constructively work with the reviewer to get your code into a mergeable state (see also [below](#reviewing-plugin-code)).
|
||||
1. Get a final review by one of the InfluxData maintainers.
|
||||
Please fix any issue raised.
|
||||
1. Wait for the pull-request to be merged.
|
||||
It might take some time until your PR gets merged, depending on the release cycle and the type of
|
||||
your pull-request (bugfix, enhancement of existing code, new plugin, etc). Remember, it might be necessary to rebase your code before merge to resolve conflicts.
|
||||
|
||||
Please read the review comments carefully, fix the related part of the code and/or respond in case there is anything unclear. Maintainers will add the `waiting for response` tag to PRs to make it clear we are waiting on the submitter for updates. __Once the tag is added, if there is no activity on a pull request or the contributor does not respond, our bot will automatically close the PR after two weeks!__ If you expect a longer period of inactivity or you want to abandon a pull request, please let us know.
|
||||
|
||||
In case you still want to continue with the PR, feel free to reopen it.
|
||||
|
||||
## Reviewing Plugin Code
|
||||
|
||||
- Avoid variables scoped to the package. Everything should be scoped to the plugin struct, since multiple instances of the same plugin are allowed and package-level variables will cause race conditions.
|
||||
- SampleConfig must match the readme, but not include the plugin name.
|
||||
- structs should include toml tags for fields that are expected to be editable from the config. eg `toml:"command"` (snake_case)
|
||||
- plugins that want to log should declare the Telegraf logger, not use the log package. eg:
|
||||
|
||||
```Go
|
||||
Log telegraf.Logger `toml:"-"`
|
||||
```
|
||||
|
||||
(in tests, you can do `myPlugin.Log = testutil.Logger{}`)
|
||||
|
||||
- Initialization and config checking should be done in the `Init() error` function, not in the Connect, Gather, or Start functions.
|
||||
- `Init() error` should not contain connections to external services. If anything fails in Init, Telegraf will consider it a configuration error and refuse to start.
|
||||
- plugins should avoid synchronization code if they are not starting goroutines. Plugin functions are never called in parallel.
|
||||
- avoid goroutines when you don't need them and removing them would simplify the code
|
||||
- errors should almost always be checked.
|
||||
- avoid boolean fields when a string or enumerated type would be better for future extension. Lots of boolean fields also make the code difficult to maintain.
|
||||
- use config.Duration instead of internal.Duration
|
||||
- compose tls.ClientConfig as opposed to specifying all the TLS fields manually
|
||||
- http.Client should be declared once on `Init() error` and reused, (or better yet, on the package if there's no client-specific configuration). http.Client has built-in concurrency protection and reuses connections transparently when possible.
|
||||
- avoid doing network calls in loops where possible, as this has a large performance cost. This isn't always possible to avoid.
|
||||
- when processing batches of records with multiple network requests (some outputs that need to partition writes do this), return an error when you want the whole batch to be retried, log the error when you want the batch to continue without the record
|
||||
- consider using the StreamingProcessor interface instead of the (legacy) Processor interface
|
||||
- avoid network calls in processors when at all possible. If it's necessary, it's possible, but complicated (see processor.reversedns).
|
||||
- avoid dependencies when:
|
||||
- they require cgo
|
||||
- they pull in massive projects instead of small libraries
|
||||
- they could be replaced by a simple http call
|
||||
- they seem unnecessary, superfluous, or gratuitous
|
||||
- consider adding build tags if plugins have OS-specific considerations
|
||||
- use the right logger log levels so that Telegraf is normally quiet eg `plugin.Log.Debugf()` only shows up when running Telegraf with `--debug`
|
||||
- consistent field types: dynamically setting the type of a field should be strongly avoided as it causes problems that are difficult to solve later, made worse by having to worry about backwards compatibility in future changes. For example, if a numeric value comes from a string field and it is not clear if the field can sometimes be a float, the author should pick either a float or an int, and parse that field consistently every time. Better to sometimes truncate a float, or to always store ints as floats, rather than changing the field type, which causes downstream problems with output databases.
|
||||
- backwards compatibility: We work hard not to break existing configurations during new changes. Upgrading Telegraf should be a seamless transition. Possible tools to make this transition smooth are:
|
||||
- enumerable type fields that allow you to customize behavior (avoid boolean feature flags)
|
||||
- version fields that can be used to opt in to newer changed behavior without breaking old (see inputs.mysql for example)
|
||||
- a new version of the plugin if it has changed significantly (eg outputs.influxdb and outputs.influxdb_v2)
|
||||
- Logger and README deprecation warnings
|
||||
- changing the default value of a field can be okay, but will affect users who have not specified the field and should be approached cautiously.
|
||||
- The general rule here is "don't surprise me": users should not be caught off-guard by unexpected or breaking changes.
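Several of the points above (config checking in `Init`, no connections in `Init`, one reusable `http.Client`) can be sketched together. The `Example` plugin and its options are hypothetical, and plain `time.Duration` is used only to keep the snippet free of Telegraf imports; a real plugin would use `config.Duration` as noted in the list.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/url"
	"time"
)

// Example is a hypothetical plugin; its options are assumptions.
type Example struct {
	URL     string        `toml:"url"`
	Timeout time.Duration `toml:"timeout"`

	// client is created once in Init and reused for every request.
	client *http.Client
}

// Init checks the static configuration and prepares reusable state. It does
// not contact the remote service, so any error is a configuration error.
func (e *Example) Init() error {
	if e.URL == "" {
		return errors.New("url must not be empty")
	}
	u, err := url.Parse(e.URL)
	if err != nil {
		return fmt.Errorf("parsing url failed: %w", err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
	if e.Timeout <= 0 {
		e.Timeout = 5 * time.Second // default for an unset option
	}
	e.client = &http.Client{Timeout: e.Timeout}
	return nil
}

func main() {
	plugin := &Example{URL: "http://localhost:8080/metrics"}
	if err := plugin.Init(); err != nil {
		panic(err)
	}
	fmt.Println("client timeout:", plugin.client.Timeout)

	broken := &Example{URL: "ftp://localhost/metrics"}
	fmt.Println("init error:", broken.Init())
}
```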
|
||||
|
||||
## Linting
|
||||
|
||||
Each pull request will have the appropriate linters checking the files for any common mistakes. The github action Super Linter is used: [super-linter](https://github.com/github/super-linter). If it is failing you can click on the action and read the logs to figure out the issue. You can also run the github action locally by following these instructions: [run-linter-locally.md](https://github.com/github/super-linter/blob/main/docs/run-linter-locally.md). You can find more information on each of the linters in the super linter readme.
|
||||
|
||||
## Testing
|
||||
|
||||
Sufficient unit tests must be created. New plugins must always contain
|
||||
some unit tests. Bug fixes and enhancements should include new tests, but
|
||||
they can be omitted if the reviewer thinks it would not be worth the effort.
|
||||
|
||||
[Table Driven Tests](https://github.com/golang/go/wiki/TableDrivenTests) are
|
||||
encouraged to reduce boilerplate in unit tests.
|
||||
|
||||
The [stretchr/testify](https://github.com/stretchr/testify) library should be
|
||||
used for assertions within the tests when possible, with preference towards
|
||||
github.com/stretchr/testify/require.
|
||||
|
||||
Primarily use the require package to avoid cascading errors:
|
||||
|
||||
```go
|
||||
assert.Equal(t, lhs, rhs) // avoid
|
||||
require.Equal(t, lhs, rhs) // good
|
||||
```
|
||||
|
||||
## Configuration
|
||||
|
||||
The config file is the primary interface and should be carefully scrutinized.
|
||||
|
||||
Ensure the [[SampleConfig]] and
|
||||
[README](https://github.com/influxdata/telegraf/blob/master/plugins/inputs/EXAMPLE_README.md)
|
||||
match with the current standards.
|
||||
|
||||
READMEs should:
|
||||
|
||||
- use spaces, not tabs
|
||||
- be indented consistently, matching other READMEs
|
||||
- have two `#` for comments
|
||||
- have one `#` for defaults, which should always match the default value of the plugin
|
||||
- include all appropriate types as a list for enumerable field types
|
||||
- include a useful example, avoiding "example", "test", etc.
|
||||
- include tips for any common problems
|
||||
- include example output from the plugin, if input/processor/aggregator/parser/serializer
|
||||
|
||||
## Metric Schema
|
||||
|
||||
Telegraf metrics are heavily based on InfluxDB points, but have some
|
||||
extensions to support other outputs and metadata.
|
||||
|
||||
New metrics must follow the recommended
|
||||
[schema design](https://docs.influxdata.com/influxdb/latest/concepts/schema_and_data_layout/).
|
||||
Each metric should be evaluated for _series cardinality_, proper use of tags vs
|
||||
fields, and should use existing patterns for encoding metrics.
|
||||
|
||||
Metrics use `snake_case` naming style.
|
||||
|
||||
### Enumerations
|
||||
|
||||
Generally enumeration data should be encoded as a tag. In some cases it may
|
||||
be desirable to also include the data as an integer field:
|
||||
|
||||
```shell
|
||||
net_response,result=success result_code=0i
|
||||
```
|
||||
|
||||
### Histograms
|
||||
|
||||
Use tags for each range with the `le` tag, and `+Inf` for the values out of
|
||||
range. This format is inspired by the Prometheus project:
|
||||
|
||||
```shell
|
||||
cpu,le=0.0 usage_idle_bucket=0i 1486998330000000000
|
||||
cpu,le=50.0 usage_idle_bucket=2i 1486998330000000000
|
||||
cpu,le=100.0 usage_idle_bucket=2i 1486998330000000000
|
||||
cpu,le=+Inf usage_idle_bucket=2i 1486998330000000000
|
||||
```
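Reading the example as Prometheus-style cumulative buckets, where each `le` bucket counts all values less than or equal to its bound, the counts could be computed as in this sketch. `cumulativeBuckets` and the sample values `12.5` and `40` are assumptions picked so that the result mirrors the lines shown in this section (without timestamps).

```go
package main

import (
	"fmt"
	"sort"
)

// bucket is one cumulative histogram bucket; Le is the value of the `le` tag.
type bucket struct {
	Le    string
	Count int64
}

// cumulativeBuckets counts, for every upper bound, the values less than or
// equal to it, and appends a final "+Inf" bucket that counts every value.
func cumulativeBuckets(values, bounds []float64) []bucket {
	sorted := append([]float64(nil), bounds...) // do not modify the input
	sort.Float64s(sorted)

	buckets := make([]bucket, 0, len(sorted)+1)
	for _, bound := range sorted {
		var count int64
		for _, v := range values {
			if v <= bound {
				count++
			}
		}
		buckets = append(buckets, bucket{Le: fmt.Sprintf("%.1f", bound), Count: count})
	}
	return append(buckets, bucket{Le: "+Inf", Count: int64(len(values))})
}

func main() {
	values := []float64{12.5, 40} // hypothetical usage_idle samples
	for _, b := range cumulativeBuckets(values, []float64{0, 50, 100}) {
		fmt.Printf("cpu,le=%s usage_idle_bucket=%di\n", b.Le, b.Count)
	}
}
```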
|
||||
|
||||
### Lists
|
||||
|
||||
Lists are tricky, but the general technique is to encode using a tag, creating
|
||||
one series per item in the list.
|
||||
|
||||
### Counters
|
||||
|
||||
Counters retrieved from other projects often are in one of two styles,
|
||||
monotonically increasing without reset and reset on each interval. No attempt
|
||||
should be made to switch between these two styles but if given the option it
|
||||
is preferred to use the non-resetting variant. This style is more resilient in
|
||||
the face of downtime and does not contain a fixed time element.
|
||||
|
||||
### Source tag
|
||||
|
||||
When metrics are gathered from another host, the metric schema should have a tag
|
||||
named "source" that contains the other host's name. See [this feature
|
||||
request](https://github.com/influxdata/telegraf/issues/4413) for details.
|
||||
|
||||
The metric schema doesn't need to have a tag for the host running
|
||||
telegraf. Telegraf agent code can add a tag named "host", by default
|
||||
containing the hostname reported by the kernel. This can be configured through
|
||||
the "hostname" and "omit_hostname" agent settings.
|
||||
|
||||
## Go Best Practices
|
||||
|
||||
In general, code should follow the best practices described in [Code Review
|
||||
Comments](https://github.com/golang/go/wiki/CodeReviewComments).
|
||||
|
||||
### Networking
|
||||
|
||||
All network operations should have appropriate timeouts. The ability to
|
||||
cancel the operation, preferably using a context, is desirable but not always
|
||||
worth the implementation complexity.
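
One way to get both a timeout and cancellation is to attach a context to the request. The sketch below uses only the Go standard library; the `fetch` and `timesOut` helpers and the slow local test server are hypothetical and exist only to demonstrate the behavior.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetch performs a GET request that is abandoned once the timeout elapses or
// the parent context is cancelled, whichever happens first.
func fetch(ctx context.Context, url string, timeout time.Duration) ([]byte, error) {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return io.ReadAll(resp.Body)
}

// timesOut reports whether fetch gives up on a deliberately slow test server.
func timesOut() bool {
	srv := httptest.NewServer(http.HandlerFunc(func(_ http.ResponseWriter, r *http.Request) {
		// Simulate a hanging endpoint, but stop early once the client is gone.
		select {
		case <-time.After(2 * time.Second):
		case <-r.Context().Done():
		}
	}))
	defer srv.Close()

	_, err := fetch(context.Background(), srv.URL, 50*time.Millisecond)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println("request timed out:", timesOut())
}
```

Passing the parent context through means a caller can also cancel the request early, for example on shutdown.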
|
||||
|
||||
### Channels
|
||||
|
||||
Channels should be used judiciously as they often complicate the design and
|
||||
can easily be used improperly. Only use them when they are needed.
|
81
docs/developers/SAMPLE_CONFIG.md
Normal file
|
@ -0,0 +1,81 @@
|
|||
# Sample Configuration
|
||||
|
||||
The sample config file is generated from the results of the `SampleConfig()` functions of the plugins.
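
As a rough illustration, a plugin's `SampleConfig()` function can simply return its documented TOML snippet as a string. The `Example` plugin and the `sampleConfig` constant below are hypothetical; how the snippet is stored (constant, separate file, or otherwise) is an assumption of this sketch.

```go
package main

import "fmt"

// Example is a hypothetical plugin used only for illustration.
type Example struct {
	ExchangeType string `toml:"exchange_type"`
}

// sampleConfig holds the documented TOML snippet for the plugin.
const sampleConfig = `  ## This text describes what the exchange_type option does.
  # exchange_type = "topic"
`

// SampleConfig returns the snippet that ends up in the generated config.
func (*Example) SampleConfig() string {
	return sampleConfig
}

func main() {
	fmt.Print((&Example{}).SampleConfig())
}
```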
|
||||
|
||||
You can generate a full sample
|
||||
config:
|
||||
|
||||
```shell
|
||||
telegraf config
|
||||
```
|
||||
|
||||
You can also generate the config for a particular plugin using the `--usage`
|
||||
option:
|
||||
|
||||
```shell
|
||||
telegraf --usage influxdb
|
||||
```
|
||||
|
||||
## Style
|
||||
|
||||
In the config file we use 2-space indentation. Since the config is
|
||||
[TOML](https://github.com/toml-lang/toml) the indentation has no meaning.
|
||||
|
||||
Documentation is double commented, written in full sentences, and ends with a period.
|
||||
|
||||
```toml
|
||||
## This text describes what the exchange_type option does.
|
||||
# exchange_type = "topic"
|
||||
```
|
||||
|
||||
Try to give every parameter a default value whenever possible. If a
|
||||
parameter does not have a default or must frequently be changed then have it
|
||||
uncommented.
|
||||
|
||||
```toml
|
||||
## Brokers are the AMQP brokers to connect to.
|
||||
brokers = ["amqp://localhost:5672"]
|
||||
```
|
||||
|
||||
Options where the default value is usually sufficient are normally commented
|
||||
out. The commented out value is the default.
|
||||
|
||||
```toml
|
||||
## What an exchange type is.
|
||||
# exchange_type = "topic"
|
||||
```
|
||||
|
||||
If you want to show an example of a possible setting filled out that is
|
||||
different from the default, show both:
|
||||
|
||||
```toml
|
||||
## Static routing key. Used when no routing_tag is set or as a fallback
|
||||
## when the tag specified in routing tag is not found.
|
||||
## example: routing_key = "telegraf"
|
||||
# routing_key = ""
|
||||
```
|
||||
|
||||
Unless parameters are closely related, add a space between them. Usually
|
||||
parameters that are closely related have a single description.
|
||||
|
||||
```toml
|
||||
## If true, queue will be declared as an exclusive queue.
|
||||
# queue_exclusive = false
|
||||
|
||||
## If true, queue will be declared as an auto deleted queue.
|
||||
# queue_auto_delete = false
|
||||
|
||||
## Authentication credentials for the PLAIN auth_method.
|
||||
# username = ""
|
||||
# password = ""
|
||||
```
|
||||
|
||||
Parameters should usually be describable in a few sentences. If it takes
|
||||
much more than this, try to provide a shorter explanation and provide a more
|
||||
complex description in the Configuration section of the plugin's
|
||||
[README](https://github.com/influxdata/telegraf/tree/master/plugins/inputs/example).
|
||||
|
||||
Boolean parameters should be used judiciously. You should try to think of
|
||||
something better since they don't scale well, things are often not truly
|
||||
boolean, and frequently end up with implicit dependencies: this option does
|
||||
something if this and this are also set.
|
145
docs/developers/STATE_PERSISTENCE.md
Normal file
|
@ -0,0 +1,145 @@
|
|||
# State-persistence for plugins
|
||||
|
||||
## Purpose
|
||||
|
||||
Plugin state-persistence allows a plugin to save its state across restarts of
|
||||
Telegraf. This might be necessary if data-input (or output) is stateful and
|
||||
depends on the result of a previous operation.
|
||||
|
||||
If, for example, you query data from a service providing a `next` token, your
|
||||
plugin would need to know the last token received in order to make the next
|
||||
query. However, this token is lost after a restart of Telegraf if not persisted
|
||||
and thus your only chance is to restart the query chain potentially resulting
|
||||
in handling redundant data producing unnecessary traffic.
|
||||
|
||||
This is where state-persistence comes into play. The state-persistence framework
|
||||
allows your plugin to store a _state_ on shutdown and load that _state_ again
|
||||
on startup of Telegraf.
|
||||
|
||||
## State format
|
||||
|
||||
The _state_ of a plugin can be any structure or datatype that is serializable
|
||||
using Golang's JSON serializer. It can be a key-value map or a more complex
|
||||
structure. E.g.
|
||||
|
||||
```go
|
||||
type MyState struct {
|
||||
CurrentToken string
|
||||
LastToken string
|
||||
NextToken string
|
||||
FilterIDs []int64
|
||||
}
|
||||
```
|
||||
|
||||
would represent a valid state.
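
A quick way to check that a state type qualifies is to round-trip it through `encoding/json`. The sketch below does this for the `MyState` structure; the `roundTrip` helper and the sample values are hypothetical. Note that `encoding/json` only serializes exported (capitalized) struct fields, so unexported fields would be lost.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// MyState is the example state from this document.
type MyState struct {
	CurrentToken string
	LastToken    string
	NextToken    string
	FilterIDs    []int64
}

// roundTrip serializes the state to JSON and reads it back.
func roundTrip(in MyState) (MyState, error) {
	buf, err := json.Marshal(in)
	if err != nil {
		return MyState{}, err
	}
	var out MyState
	err = json.Unmarshal(buf, &out)
	return out, err
}

func main() {
	in := MyState{CurrentToken: "b", LastToken: "a", NextToken: "c", FilterIDs: []int64{1, 2, 3}}
	out, err := roundTrip(in)
	if err != nil {
		fmt.Println("state is not serializable:", err)
		return
	}
	fmt.Println("state survived the round trip:", reflect.DeepEqual(in, out))
}
```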
|
||||
|
||||
## Implementation
|
||||
|
||||
To enable state-persistence in your plugin you need to implement the
|
||||
`StatefulPlugin` interface defined in `plugin.go`. The interface looks as
|
||||
follows:
|
||||
|
||||
```go
|
||||
type StatefulPlugin interface {
|
||||
GetState() interface{}
|
||||
SetState(state interface{}) error
|
||||
}
|
||||
```
|
||||
|
||||
The `GetState()` function should return the current state of the plugin
|
||||
(see [state format](#state-format)). Please note that this function should
|
||||
_always_ succeed and should always be callable directly after `Init()`. So make
|
||||
sure your relevant data-structures are initialized in `Init` to prevent panics.
|
||||
|
||||
Telegraf will call the `GetState()` function on shutdown and will then compile
|
||||
an overall Telegraf state from the information of all stateful plugins. This
|
||||
state is then persisted to disk if (and only if) the `statefile` option in the
|
||||
`agent` section is set. You do _not_ need to take care of any serialization or
|
||||
writing, Telegraf will handle this for you.
|
||||
|
||||
When starting Telegraf, the overall persisted Telegraf state will be restored,
|
||||
if `statefile` is set. To do so, the `SetState()` function is called with the
|
||||
deserialized state of the plugin. Please note that this function is called
|
||||
directly _after_ the `Init()` function of your plugin. You need to make sure
|
||||
that the given state is what you expect using a type-assertion! Make sure this
|
||||
won't panic but rather return a meaningful error.
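
The following is a minimal sketch of such an implementation, not code from Telegraf. The interface is repeated locally so the example compiles on its own, `MyPlugin` and its `nextToken` field are hypothetical, and the sketch assumes the state handed to `SetState()` has the same type that `GetState()` returned.

```go
package main

import "fmt"

// StatefulPlugin mirrors the interface shown above so that this sketch
// compiles without importing Telegraf.
type StatefulPlugin interface {
	GetState() interface{}
	SetState(state interface{}) error
}

// MyPlugin is a hypothetical plugin that remembers the last `next` token.
type MyPlugin struct {
	nextToken string
}

// Init prepares all data needed by GetState so that it cannot panic later.
func (p *MyPlugin) Init() error {
	p.nextToken = ""
	return nil
}

// GetState always succeeds and returns a JSON-serializable value.
func (p *MyPlugin) GetState() interface{} {
	return p.nextToken
}

// SetState validates the restored state with a checked type assertion and
// returns an error instead of panicking on unexpected input.
func (p *MyPlugin) SetState(state interface{}) error {
	token, ok := state.(string)
	if !ok {
		return fmt.Errorf("invalid state type %T, expected string", state)
	}
	p.nextToken = token
	return nil
}

// Compile-time check that MyPlugin satisfies the interface.
var _ StatefulPlugin = (*MyPlugin)(nil)

func main() {
	p := &MyPlugin{}
	if err := p.Init(); err != nil {
		fmt.Println("init failed:", err)
		return
	}
	fmt.Println("restoring a bad state:", p.SetState(42))
	fmt.Println("restoring a good state:", p.SetState("token-123"))
	fmt.Println("current state:", p.GetState())
}
```

The two-value form of the type assertion (`token, ok := ...`) is what keeps `SetState()` from panicking when the state has an unexpected type.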
|
||||
|
||||
To assign the state to the correct plugin, Telegraf relies on a plugin ID.
|
||||
See the ["State assignment" section](#state-assignment) for more details on
|
||||
the procedure and ["Plugin Identifier" section](#plugin-identifier) for more
|
||||
details on ID generation.
|
||||
|
||||
## State assignment
|
||||
|
||||
When restoring the state on loading, Telegraf needs to ensure that each plugin
|
||||
_instance_ gets the correct state. To do so, a plugin ID is used. By default
|
||||
this ID is generated automatically for each plugin instance but can be
|
||||
overwritten if necessary (see [Plugin Identifier](#plugin-identifier)).
|
||||
|
||||
State assignment needs to be able to handle multiple instances of the same
|
||||
plugin type correctly, e.g. if the user has configured multiple instances of
|
||||
your plugin with different `server` settings. Here, the state saved for
|
||||
`foo.example.com` needs to be restored to the plugin instance handling
|
||||
`foo.example.com` on next startup of Telegraf and should _not_ end up at server
|
||||
`bar.example.com`. So the plugin identifier used for the assignment should be
|
||||
consistent over restarts of Telegraf.
|
||||
|
||||
In case plugin instances are added to the configuration between restarts, no
|
||||
state is restored _for those instances_. Furthermore, all states referencing
|
||||
plugin identifiers that are no longer valid are dropped and will be ignored. This
|
||||
can happen in case plugin instances are removed or changed in ID.
|
||||
|
||||
## Plugin Identifier
|
||||
|
||||
As outlined above, the plugin identifier (plugin ID) is crucial when assigning
|
||||
states to plugin instances. By default, Telegraf will automatically generate an
|
||||
identifier for each plugin configured when starting up. The ID is consistent
|
||||
over restarts of Telegraf and is based on the _entire configuration_ of the
|
||||
plugin. This means for each plugin instance, all settings in the configuration
|
||||
will be concatenated and hashed to derive the ID. The resulting ID will then be
|
||||
used in both save and restore operations making sure the state ends up in a
|
||||
plugin with _exactly_ the same configuration that created the state.
|
||||
|
||||
However, this also means that the plugin identifier _is changing_ whenever _any_
|
||||
of the configuration settings is changed! For example if your plugin is defined
|
||||
as
|
||||
|
||||
```go
|
||||
type MyPlugin struct {
|
||||
Server string `toml:"server"`
|
||||
Token string `toml:"token"`
|
||||
Timeout config.Duration `toml:"timeout"`
|
||||
|
||||
offset int
|
||||
}
|
||||
```
|
||||
|
||||
with `offset` being your state, the plugin ID will change if a user changes the
|
||||
`timeout` setting in the configuration file. As a consequence the state cannot
|
||||
be restored. This might be undesirable for your plugin, therefore you can
|
||||
overwrite the ID generation by implementing the `PluginWithID` interface (see
|
||||
`plugin.go`). This interface defines an `ID() string` function returning the
|
||||
identifier of the current plugin _instance_. When implementing this function you
|
||||
should take the following criteria into account:
|
||||
|
||||
1. The identifier has to be _unique_ for your plugin _instance_ (not only for
|
||||
the plugin type) to make sure the state is assigned to the correct instance.
|
||||
1. The identifier has to be _consistent_ across startups/restarts of Telegraf
|
||||
as otherwise the state cannot be restored. Make sure the order of
|
||||
configuration settings doesn't matter.
|
||||
1. Make sure to _include all settings relevant for state assignment_. In
|
||||
the example above, the plugin's `token` setting might or might not be
|
||||
relevant to identify the plugin instance.
|
||||
1. Make sure to _leave out all settings irrelevant for state assignment_. In
|
||||
the example above, the plugin's `timeout` setting likely is not relevant
|
||||
for the state and can be left out.
|
||||
|
||||
Which settings are relevant for the state is plugin specific. For example, if
|
||||
the `offset` is a property of the _server_, the `token` setting is irrelevant.
|
||||
However, if the `offset` is specific to a certain user, the `token`
|
||||
setting is relevant.
|
||||
|
||||
As an alternative to generating an identifier automatically, the plugin can allow
|
||||
the user to specify that ID directly in a configuration setting. However, please
|
||||
note that this might lead to colliding IDs in larger setups and should thus be
|
||||
avoided.
|