
Adding upstream version 1.34.4.

Signed-off-by: Daniel Baumann <daniel@debian.org>
Daniel Baumann 2025-05-24 07:26:29 +02:00
parent e393c3af3f
commit 4978089aab
Signed by: daniel
GPG key ID: FBB4F0E80A80222F
4963 changed files with 677545 additions and 0 deletions

@@ -0,0 +1,291 @@
# Starlark Processor Plugin
The `starlark` processor calls a Starlark function for each matched metric,
allowing for custom programmatic metric processing.
The Starlark language is a dialect of Python, and will be familiar to those who
have experience with the Python language. However, there are major
[differences](#python-differences). Existing Python code is unlikely to work
unmodified. The execution environment is sandboxed, and it is not possible to
do I/O operations such as reading from files or sockets.
The **[Starlark specification][]** has details about the syntax and available
functions.
Telegraf minimum version: Telegraf 1.15.0
## Global configuration options <!-- @/docs/includes/plugin_config.md -->
In addition to the plugin-specific configuration settings, plugins support
additional global and plugin configuration settings. These settings are used to
modify metrics, tags, and fields, or create aliases and configure ordering, etc.
See the [CONFIGURATION.md][CONFIGURATION.md] for more details.
[CONFIGURATION.md]: ../../../docs/CONFIGURATION.md#plugins
## Configuration
```toml @sample.conf
# Process metrics using a Starlark script
[[processors.starlark]]
  ## The Starlark source can be set as a string in this configuration file, or
  ## by referencing a file containing the script. Only one source or script
  ## should be set at once.

  ## Source of the Starlark script.
  source = '''
def apply(metric):
    return metric
'''

  ## File containing a Starlark script.
  # script = "/usr/local/bin/myscript.star"

  ## The constants of the Starlark script.
  # [processors.starlark.constants]
  #   max_size = 10
  #   threshold = 0.75
  #   default_name = "Julia"
  #   debug_mode = true
```
## Usage
The Starlark code should contain a function called `apply` that takes a metric
as its single argument. The function will be called with each metric, and can
return `None`, a single metric, or a list of metrics.
```python
def apply(metric):
    return metric
```
For a list of available types and functions that can be used in the code, see
the [Starlark specification][].
In addition to these, the following InfluxDB-specific
types and functions are exposed to the script.
- **Metric(*name*)**:
Create a new metric with the given measurement name. The metric will have no
tags or fields and defaults to the current time.
- **name**:
The name is a [string][] containing the metric measurement name.
- **tags**:
A [dict-like][dict] object containing the metric's tags.
- **fields**:
A [dict-like][dict] object containing the metric's fields. The values may be
of type int, float, string, or bool.
- **time**:
The timestamp of the metric as an integer in nanoseconds since the Unix
epoch.
- **deepcopy(*metric*, *track=false*)**:
Copy an existing metric with or without tracking information. If `track` is set
to `true`, the tracking information is copied.
**Caution:** Make sure to always return *all* metrics with tracking information!
Otherwise, the corresponding inputs will never receive the delivery information
and potentially overrun!
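For instance, a quick sketch exercising these types (the measurement, tag, and
field names here are invented):

```python
def apply(metric):
    # Build a brand-new metric next to the incoming one
    m = Metric("activity")
    m.tags["host"] = metric.tags.get("host", "unknown")
    m.fields["count"] = 1
    m.time = metric.time  # reuse the incoming timestamp
    return [metric, m]
```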
### Python Differences
While Starlark is similar to Python, there are important differences to note:
- Starlark has limited support for error handling and no exceptions. If an
error occurs, the script will immediately end and Telegraf will drop the
metric. Check the Telegraf logfile for details about the error.
- It is not possible to import other packages and the Python standard library
is not available.
- It is not possible to open files or sockets.
- These common keywords are **not supported** in the Starlark grammar:
```text
as finally nonlocal
assert from raise
class global try
del import with
except is yield
```
### Libraries available
Support for loading external scripts beyond your own is limited. The
following libraries are available for loading:
- json: `load("json.star", "json")` provides the following functions: `json.encode()`, `json.decode()`, `json.indent()`. See [json.star](testdata/json.star) for an example. For more details about the functions, please refer to [the documentation of this library](https://pkg.go.dev/go.starlark.net/lib/json).
- log: `load("logging.star", "log")` provides the following functions: `log.debug()`, `log.info()`, `log.warn()`, `log.error()`. See [logging.star](testdata/logging.star) for an example.
- math: `load("math.star", "math")` provides [the following functions and constants](https://pkg.go.dev/go.starlark.net/lib/math). See [math.star](testdata/math.star) for an example.
- time: `load("time.star", "time")` provides the following functions: `time.from_timestamp()`, `time.is_valid_timezone()`, `time.now()`, `time.parse_duration()`, `time.parse_time()`, `time.time()`. See [time_date.star](testdata/time_date.star), [time_duration.star](testdata/time_duration.star) and/or [time_timestamp.star](testdata/time_timestamp.star) for an example. For more details about the functions, please refer to [the documentation of this library](https://pkg.go.dev/go.starlark.net/lib/time).
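As a minimal sketch, a script combining two of these modules (the field names
are invented):

```python
load("logging.star", "log")
load("math.star", "math")

def apply(metric):
    value = metric.fields.get("value")
    if value != None:
        # compute a derived field using the math module
        metric.fields["sqrt"] = math.sqrt(float(value))
        log.info("computed sqrt for metric {}".format(metric.name))
    return metric
```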
If you would like to see support for something else here, please open an issue.
### Common Questions
**What's the performance cost to using Starlark?**
In local tests, it takes about 1µs (1 microsecond) to run a modest script to
process one metric. This is going to vary with the size of your script, but the
total impact is minimal. At this pace, it's likely not going to be the
bottleneck in your Telegraf setup.
**How can I drop/delete a metric?**
If you don't return the metric, it will be deleted. Usually this means the
function should `return None`.
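A minimal sketch, assuming a hypothetical `environment` tag:

```python
def apply(metric):
    if metric.tags.get("environment") == "test":
        return None  # drop test metrics
    return metric
```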
**How should I make a copy of a metric?**
Use `deepcopy(metric)` to create a copy of the metric.
**How can I return multiple metrics?**
You can return a list of metrics:
```python
def apply(metric):
    m2 = deepcopy(metric)
    return [metric, m2]
```
**What happens to a tracking metric if an error occurs in the script?**
The metric is marked as undelivered.
**How do I create a new metric?**
Use the `Metric(name)` function and set at least one field.
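A minimal sketch (the measurement and field names are invented):

```python
def apply(metric):
    m = Metric("heartbeat")
    m.fields["status"] = 1
    m.time = metric.time
    return m
```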
**What is the fastest way to iterate over tags/fields?**
The fastest way to iterate is to use a for-loop on the tags or fields attribute:
```python
def apply(metric):
    for k in metric.tags:
        pass
    return metric
```
When you use this form, it is not possible to modify the tags inside the loop.
If modification is needed, use one of the `.keys()`, `.values()`, or
`.items()` methods:
```python
def apply(metric):
    for k, v in metric.tags.items():
        pass
    return metric
```
**How can I save values across multiple calls to the script?**
Telegraf freezes the global scope, which prevents it from being modified, except
for a special shared global dictionary named `state`, which can be used by the
`apply` function. See an example of this in [compare with previous
metric](testdata/compare_metrics.star).
Other than the `state` variable, attempting to modify the global scope will fail
with an error.
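As a minimal sketch, a counter kept across calls (the key and field names are
arbitrary):

```python
state = {"count": 0}

def apply(metric):
    state["count"] += 1
    metric.fields["seen"] = state["count"]
    return metric
```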
**How do I manage errors that occur in the apply function?**
In case you need to call some code that may return an error, you can delegate
the call to the built-in function `catch`, which takes a callable as its
argument and returns the error that occurred, if any, or `None` otherwise.
For example:
```python
load("json.star", "json")
def apply(metric):
error = catch(lambda: failing(metric))
if error != None:
# Some code to execute in case of an error
metric.fields["error"] = error
return metric
def failing(metric):
json.decode("non-json-content")
```
**How do I reuse the same script with different parameters?**
In case you have a generic script that you would like to reuse for different
instances of the plugin, you can use constants as input parameters of your
script.
For example, assuming you have the following configuration:
```toml
[[processors.starlark]]
  script = "/usr/local/bin/myscript.star"

  [processors.starlark.constants]
    somecustomnum = 10
    somecustomstr = "mycustomfield"
```
Your script could then use the constants defined in the configuration as
follows:
```python
def apply(metric):
    if metric.fields[somecustomstr] >= somecustomnum:
        metric.fields.clear()
    return metric
```
**What does `cannot represent integer ...` mean?**
The error occurs if an integer value in Starlark exceeds the signed 64-bit
integer limit. This can happen if you are summing up large values in a Starlark
integer or converting an unsigned 64-bit integer to Starlark and then creating
a new metric field from it.
This is because integer values in Starlark are *always* signed and can grow
beyond the 64-bit size. Converting such a value back to a metric field
therefore fails in the cases mentioned above.
As a workaround you can either clip the field value at the signed 64-bit limit
or return the value as a floating-point number.
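A minimal sketch of both workarounds, assuming a hypothetical `bytes` field
that is summed across metrics:

```python
MAX_INT64 = 9223372036854775807

state = {"total": 0}

def apply(metric):
    # Summing large values can exceed the signed 64-bit limit over time
    state["total"] += metric.fields.get("bytes", 0)
    # Clip before writing the value back to a field...
    metric.fields["total_bytes"] = min(state["total"], MAX_INT64)
    # ...or return it as a floating-point number instead
    metric.fields["total_bytes_float"] = float(state["total"])
    return metric
```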
### Examples
- [drop string fields](testdata/drop_string_fields.star) - Drop fields containing string values.
- [drop fields with unexpected type](testdata/drop_fields_with_unexpected_type.star) - Drop fields containing unexpected value types.
- [iops](testdata/iops.star) - Obtain IOPS (to aggregate, to produce max_iops).
- [json](testdata/json.star) - Process JSON from a field in a metric.
- [math](testdata/math.star) - Use a math function to compute the value of a field. [The list of the supported math functions and constants](https://pkg.go.dev/go.starlark.net/lib/math).
- [number logic](testdata/number_logic.star) - Transform a numerical value to another numerical value.
- [pivot](testdata/pivot.star) - Pivot a key's value to be the key for another key.
- [ratio](testdata/ratio.star) - Compute the ratio of two fields.
- [rename](testdata/rename.star) - Rename tags or fields using a name mapping.
- [scale](testdata/scale.star) - Multiply any field by a number.
- [time date](testdata/time_date.star) - Parse a date and extract the year, month and day from it.
- [time duration](testdata/time_duration.star) - Parse a duration and convert it into a total amount of seconds.
- [time timestamp](testdata/time_timestamp.star) - Filter metrics based on the timestamp in seconds.
- [time timestamp nanoseconds](testdata/time_timestamp_nanos.star) - Filter metrics based on the timestamp with nanoseconds.
- [time timestamp current](testdata/time_set_timestamp.star) - Set the metric timestamp to the current/local time.
- [value filter](testdata/value_filter.star) - Remove a metric based on a field value.
- [logging](testdata/logging.star) - Log messages with the logger of Telegraf.
- [multiple metrics](testdata/multiple_metrics.star) - Return multiple metrics by using [a list](https://docs.bazel.build/versions/master/skylark/lib/list.html) of metrics.
- [multiple metrics from json array](testdata/multiple_metrics_with_json.star) - Build a new metric from each element of a JSON array, then return all the created metrics.
- [custom error](testdata/fail.star) - Return a custom error with [fail](https://docs.bazel.build/versions/master/skylark/lib/globals.html#fail).
- [compare with previous metric](testdata/compare_metrics.star) - Compare the current metric with the previous one using the shared state.
- [rename prometheus remote write](testdata/rename_prometheus_remote_write.star) - Rename the Prometheus remote write measurement name to the field name, and rename the field name to value.
[All examples](testdata) are in the testdata folder.
Open a Pull Request to add any other useful Starlark examples.
[Starlark specification]: https://github.com/google/starlark-go/blob/d1966c6b9fcd/doc/spec.md
[string]: https://github.com/google/starlark-go/blob/d1966c6b9fcd/doc/spec.md#strings
[dict]: https://github.com/google/starlark-go/blob/d1966c6b9fcd/doc/spec.md#dictionaries

@@ -0,0 +1,21 @@
# Process metrics using a Starlark script
[[processors.starlark]]
  ## The Starlark source can be set as a string in this configuration file, or
  ## by referencing a file containing the script. Only one source or script
  ## should be set at once.

  ## Source of the Starlark script.
  source = '''
def apply(metric):
    return metric
'''

  ## File containing a Starlark script.
  # script = "/usr/local/bin/myscript.star"

  ## The constants of the Starlark script.
  # [processors.starlark.constants]
  #   max_size = 10
  #   threshold = 0.75
  #   default_name = "Julia"
  #   debug_mode = true

@@ -0,0 +1,142 @@
//go:generate ../../../tools/readme_config_includer/generator
package starlark

import (
    _ "embed"
    "errors"
    "fmt"

    "go.starlark.net/starlark"

    "github.com/influxdata/telegraf"
    common "github.com/influxdata/telegraf/plugins/common/starlark"
    "github.com/influxdata/telegraf/plugins/processors"
)

//go:embed sample.conf
var sampleConfig string

type Starlark struct {
    common.Common

    results []telegraf.Metric
}

func (*Starlark) SampleConfig() string {
    return sampleConfig
}

func (s *Starlark) Init() error {
    if err := s.Common.Init(); err != nil {
        return err
    }

    // The source should define an apply function.
    if err := s.AddFunction("apply", &common.Metric{}); err != nil {
        return err
    }

    // Preallocate a slice for return values.
    s.results = make([]telegraf.Metric, 0, 10)

    return nil
}

func (*Starlark) Start(telegraf.Accumulator) error {
    return nil
}

func (s *Starlark) Add(origMetric telegraf.Metric, acc telegraf.Accumulator) error {
    parameters, found := s.GetParameters("apply")
    if !found {
        return errors.New("the parameters of the apply function could not be found")
    }
    parameters[0].(*common.Metric).Wrap(origMetric)

    returnValue, err := s.Call("apply")
    if err != nil {
        s.LogError(err)
        return err
    }

    switch rv := returnValue.(type) {
    case *starlark.List:
        iter := rv.Iterate()
        defer iter.Done()
        var v starlark.Value
        var origFound bool
        for iter.Next(&v) {
            switch v := v.(type) {
            case *common.Metric:
                m := v.Unwrap()
                if containsMetric(s.results, m) {
                    s.Log.Errorf("Duplicate metric reference detected")
                    continue
                }

                // Previous metric was found, accept the starlark metric, add
                // the original metric to the accumulator
                if v.ID != 0 {
                    origFound = true
                    s.results = append(s.results, origMetric)
                    acc.AddMetric(origMetric)
                    continue
                }

                s.results = append(s.results, m)
                acc.AddMetric(m)
            default:
                s.Log.Errorf("Invalid type returned in list: %s", v.Type())
            }
        }

        // If the script didn't return the original metrics, mark it as
        // successfully handled.
        if !origFound {
            origMetric.Drop()
        }

        // clear results
        for i := range s.results {
            s.results[i] = nil
        }
        s.results = s.results[:0]
    case *common.Metric:
        m := rv.Unwrap()

        // If we got the original metric back, use that and drop the new one.
        // Otherwise mark the original as accepted and use the new metric.
        if rv.ID != 0 {
            acc.AddMetric(origMetric)
        } else {
            origMetric.Accept()
            acc.AddMetric(m)
        }
    case starlark.NoneType:
        origMetric.Drop()
    default:
        return fmt.Errorf("invalid type returned: %T", rv)
    }

    return nil
}

func (*Starlark) Stop() {}

func containsMetric(metrics []telegraf.Metric, target telegraf.Metric) bool {
    for _, m := range metrics {
        if m == target {
            return true
        }
    }
    return false
}

func init() {
    processors.AddStreaming("starlark", func() telegraf.StreamingProcessor {
        return &Starlark{
            Common: common.Common{
                StarlarkLoadFunc: common.LoadFunc,
            },
        }
    })
}

File diff suppressed because it is too large.

@@ -0,0 +1,26 @@
# Example showing how to keep the last metric in order to compare it with the new one.
#
# Example Input:
# cpu value=10i 1465839830100400201
# cpu value=8i 1465839830100400301
#
# Example Output:
# cpu_diff value=2i 1465839830100400301
state = {
    "last": None
}

def apply(metric):
    # Load from the shared state the metric assigned to the key "last"
    last = state["last"]
    # Store the deepcopy of the new metric into the shared state and assign it to the key "last"
    # NB: To store a metric into the shared state you have to deep copy it
    state["last"] = deepcopy(metric)
    if last != None:
        # Create a new metric named "cpu_diff"
        result = Metric("cpu_diff")
        # Set the field "value" to the difference between the value of the last metric and the current one
        result.fields["value"] = last.fields["value"] - metric.fields["value"]
        result.time = metric.time
        return result

@@ -0,0 +1,30 @@
# Drop fields if they do NOT contain values of the expected type.
#
# In this example we ignore fields with an unknown expected type and do not drop them.
#
# Example Input:
# measurement,host=hostname a=1i,b=4.2,c=42.0,d="v3.14",e=true,f=23.0 1597255410000000000
# measurement,host=hostname a=1i,b="somestring",c=42.0,d="v3.14",e=true,f=23.0 1597255410000000000
#
# Example Output:
# measurement,host=hostname a=1i,b=4.2,c=42.0,d="v3.14",e=true,f=23.0 1597255410000000000
# measurement,host=hostname a=1i,c=42.0,d="v3.14",e=true,f=23.0 1597255410000000000
load("logging.star", "log")
# loads log.debug(), log.info(), log.warn(), log.error()
expected_type = {
    "a": "int",
    "b": "float",
    "c": "float",
    "d": "string",
    "e": "bool"
}

def apply(metric):
    for k, v in metric.fields.items():
        if type(v) != expected_type.get(k, type(v)):
            metric.fields.pop(k)
            log.warn("Unexpected field type dropped: metric {} had field {} with type {}, but it is expected to be {}".format(metric.name, k, type(v), expected_type.get(k, type(v))))
    return metric

@@ -0,0 +1,14 @@
# Drop fields if they contain a string.
#
# Example Input:
# measurement,host=hostname a=1,b="somestring" 1597255410000000000
#
# Example Output:
# measurement,host=hostname a=1 1597255410000000000
def apply(metric):
    for k, v in metric.fields.items():
        if type(v) == "string":
            metric.fields.pop(k)
    return metric

@@ -0,0 +1,13 @@
# Example of returning a custom error with the built-in function fail.
# Returning an error will drop the current metric. Consider using logging instead if you want to keep the metric.
#
# Example Input:
# fail value=1 1465839830100400201
#
# Example Output Error:
# fail: The field value should be greater than 1
def apply(metric):
if metric.fields["value"] <= 1:
return fail("The field value should be greater than 1")
return metric

@@ -0,0 +1,55 @@
# Example showing how to obtain IOPS (to aggregate, to produce max_iops). Input can be produced by:
#
#[[inputs.diskio]]
# alias = "diskio1s"
# interval = "1s"
# fieldinclude = ["reads", "writes"]
# name_suffix = "1s"
#
# Example Input:
# diskio1s,host=hostname,name=diska reads=0i,writes=0i 1554079521000000000
# diskio1s,host=hostname,name=diska reads=0i,writes=0i 1554079522000000000
# diskio1s,host=hostname,name=diska reads=110i,writes=0i 1554079523000000000
# diskio1s,host=hostname,name=diska reads=110i,writes=30i 1554079524000000000
# diskio1s,host=hostname,name=diska reads=160i,writes=70i 1554079525000000000
#
# Example Output:
# diskiops,host=hostname,name=diska readsps=0,writesps=0,iops=0 1554079522000000000
# diskiops,host=hostname,name=diska readsps=110,writesps=0,iops=110 1554079523000000000
# diskiops,host=hostname,name=diska readsps=0,writesps=30,iops=30 1554079524000000000
# diskiops,host=hostname,name=diska readsps=50,writesps=40,iops=90 1554079525000000000
state = { }
def apply(metric):
    disk_name = metric.tags["name"]
    # Load from the shared state the last metric seen for this disk name
    last = state.get(disk_name)
    # Store a deep copy of the new metric into the shared state under the disk name
    # NB: To store a metric into the shared state you have to deep copy it
    state[disk_name] = deepcopy(metric)
    if last != None:
        # Create the new metric
        diskiops = Metric("diskiops")
        # Calculate reads/writes per second
        reads = metric.fields["reads"] - last.fields["reads"]
        writes = metric.fields["writes"] - last.fields["writes"]
        io = reads + writes
        interval_seconds = (metric.time - last.time) / 1000000000
        diskiops.fields["readsps"] = reads / interval_seconds
        diskiops.fields["writesps"] = writes / interval_seconds
        diskiops.fields["iops"] = io / interval_seconds
        diskiops.tags["name"] = disk_name
        diskiops.tags["host"] = metric.tags["host"]
        diskiops.time = metric.time
        return diskiops
# This could be aggregated to obtain max IOPS using:
#
# [[aggregators.basicstats]]
# namepass = ["diskiops"]
# period = "60s"
# drop_original = true
# stats = ["max"]
#
# diskiops,host=hostname,name=diska readsps_max=110,writesps_max=40,iops_max=110 1554079525000000000

@@ -0,0 +1,18 @@
# Example of parsing json out of a field and modifying the metric with it.
# This is great to use in conjunction with the value parser.
#
# Example Input:
# json value="{\"label\": \"hero\", \"count\": 14}" 1465839830100400201
#
# Example Output:
# json,label=hero count=14i 1465839830100400201
load("json.star", "json")
# loads json.encode(), json.decode(), json.indent()
def apply(metric):
    j = json.decode(metric.fields.get('value'))
    metric.fields.pop('value')
    metric.tags["label"] = j["label"]
    metric.fields["count"] = j["count"]
    return metric

@@ -0,0 +1,46 @@
#
# This code assumes the value parser with data_type='string' is used
# in the input collecting the JSON data. The entire JSON obj/doc will
# be set to a Field named `value` with which this code will work.
# JSON:
# ```
# {
# "fields": {
# "LogEndOffset": 339238,
# "LogStartOffset": 339238,
# "NumLogSegments": 1,
# "Size": 0,
# "UnderReplicatedPartitions": 0
# },
# "name": "partition",
# "tags": {
# "host": "CUD1-001559",
# "jolokia_agent_url": "http://localhost:7777/jolokia",
# "partition": "1",
# "topic": "qa-kafka-connect-logs"
# },
# "timestamp": 1591124461
# }
# ```
#
# Example Input:
# json value="[{\"fields\": {\"LogEndOffset\": 339238, \"LogStartOffset\": 339238, \"NumLogSegments\": 1, \"Size\": 0, \"UnderReplicatedPartitions\": 0}, \"name\": \"partition\", \"tags\": {\"host\": \"CUD1-001559\", \"jolokia_agent_url\": \"http://localhost:7777/jolokia\", \"partition\": \"1\", \"topic\": \"qa-kafka-connect-logs\"}, \"timestamp\": 1591124461}]"
# Example Output:
# partition,host=CUD1-001559,jolokia_agent_url=http://localhost:7777/jolokia,partition=1,topic=qa-kafka-connect-logs LogEndOffset=339238i,LogStartOffset=339238i,NumLogSegments=1i,Size=0i,UnderReplicatedPartitions=0i 1591124461000000000
load("json.star", "json")
def apply(metric):
    j_list = json.decode(metric.fields.get('value'))  # input JSON may be an array of objects
    metrics = []
    for obj in j_list:
        new_metric = Metric("partition")  # We want a new InfluxDB/Telegraf metric each iteration
        for tag in obj["tags"].items():  # 4 Tags to iterate through
            new_metric.tags[str(tag[0])] = tag[1]
        for field in obj["fields"].items():  # 5 Fields to iterate through
            new_metric.fields[str(field[0])] = field[1]
        new_metric.time = int(obj["timestamp"] * 1e9)
        metrics.append(new_metric)
    return metrics

@@ -0,0 +1,19 @@
# Example of logging a message at each of the supported levels
# using the logger of Telegraf.
#
# Example Input:
# log debug="a debug message" 1465839830100400201
#
# Example Output:
# log debug="a debug message" 1465839830100400201
load("logging.star", "log")
# loads log.debug(), log.info(), log.warn(), log.error()
def apply(metric):
    log.debug("debug: {}".format(metric.fields["debug"]))
    log.info("an info message")
    log.warn("a warning message")
    log.error("an error message")
    return metric

@@ -0,0 +1,14 @@
# Example showing how the math module can be used to compute the value of a field.
#
# Example Input:
# math value=10000i 1465839830100400201
#
# Example Output:
# math result=4 1465839830100400201
load('math.star', 'math')
# loads all the functions and constants defined in the math module
def apply(metric):
metric.fields["result"] = math.log(metric.fields.pop('value'), 10)
return metric

@@ -0,0 +1,26 @@
# Example showing how to create several metrics using the Starlark processor.
#
# Example Input:
# mm value="a" 1465839830100400201
#
# Example Output:
# mm2 value="b" 1465839830100400201
# mm1 value="a" 1465839830100400201
def apply(metric):
    # Initialize a list of metrics
    metrics = []
    # Create a new metric whose name is "mm2"
    metric2 = Metric("mm2")
    # Set the field "value" to b
    metric2.fields["value"] = "b"
    # Reset the time (only needed for testing purpose)
    metric2.time = metric.time
    # Add metric2 to the list of metrics
    metrics.append(metric2)
    # Rename the original metric to "mm1"
    metric.name = "mm1"
    # Add metric to the list of metrics
    metrics.append(metric)
    # Return the created list of metrics
    return metrics

@@ -0,0 +1,27 @@
# Example showing how to create several metrics from a json array.
#
# Example Input:
# json value="[{\"label\": \"hello\"}, {\"label\": \"world\"}]"
#
# Example Output:
# json value="hello" 1618488000000000999
# json value="world" 1618488000000000999
# loads json.encode(), json.decode(), json.indent()
load("json.star", "json")
load("time.star", "time")
def apply(metric):
    # Initialize a list of metrics
    metrics = []
    # Loop over the json array stored into the field
    for obj in json.decode(metric.fields['value']):
        # Create a new metric whose name is "json"
        current_metric = Metric("json")
        # Set the field "value" to the label extracted from the current json object
        current_metric.fields["value"] = obj["label"]
        # Reset the time (only needed for testing purpose)
        current_metric.time = time.now().unix_nano
        # Add metric to the list of metrics
        metrics.append(current_metric)
    return metrics

@@ -0,0 +1,17 @@
# Use a logic function to transform a numerical value to another numerical value
# Example: Set any 'status' field between 1 and 6 to a value of 0
#
# Example Input:
# lb,http_method=GET status=5i 1465839830100400201
#
# Example Output:
# lb,http_method=GET status=0i 1465839830100400201
def apply(metric):
    v = metric.fields.get('status')
    if v == None:
        return metric
    if 1 < v and v < 6:
        metric.fields['status'] = 0
    return metric

@@ -0,0 +1,17 @@
'''
Pivots a key's value to be the key for another key.
In this example it pivots the value of key `sensor`
to be the key of the value in key `value`
Example Input:
temperature sensor="001A0",value=111.48 1618488000000000999
Example Output:
temperature 001A0=111.48 1618488000000000999
'''
def apply(metric):
    metric.fields[str(metric.fields['sensor'])] = metric.fields['value']
    metric.fields.pop('value', None)
    metric.fields.pop('sensor', None)
    return metric

@@ -0,0 +1,15 @@
# Compute the ratio of two integer fields.
#
# Example: A new field 'usage' from the existing fields 'used' and 'total'
#
# Example Input:
# memory,host=hostname used=11038756864.4948,total=17179869184.1221 1597255082000000000
#
# Example Output:
# memory,host=hostname used=11038756864.4948,total=17179869184.1221,usage=64.25402164701573 1597255082000000000
def apply(metric):
    used = float(metric.fields['used'])
    total = float(metric.fields['total'])
    metric.fields['usage'] = (used / total) * 100
    return metric

@@ -0,0 +1,23 @@
# Rename any tags using the mapping in the renames dict.
#
# Example Input:
# measurement,host=hostname lower=0,upper=100 1597255410000000000
#
# Example Output:
# measurement,host=hostname min=0,max=100 1597255410000000000
renames = {
    'lower': 'min',
    'upper': 'max',
}

def apply(metric):
    for k, v in metric.tags.items():
        if k in renames:
            metric.tags[renames[k]] = v
            metric.tags.pop(k)
    for k, v in metric.fields.items():
        if k in renames:
            metric.fields[renames[k]] = v
            metric.fields.pop(k)
    return metric

@@ -0,0 +1,16 @@
# Specifically for prometheus remote write - renames the measurement name to the fieldname. Renames the fieldname to value.
# Assumes there is only one field as is the case for prometheus remote write.
#
# Example Input:
# prometheus_remote_write,instance=localhost:9090,job=prometheus,quantile=0.99 go_gc_duration_seconds=4.63 1618488000000000999
#
# Example Output:
# go_gc_duration_seconds,instance=localhost:9090,job=prometheus,quantile=0.99 value=4.63 1618488000000000999
def apply(metric):
    if metric.name == "prometheus_remote_write":
        for k, v in metric.fields.items():
            metric.name = k
            metric.fields["value"] = v
            metric.fields.pop(k)
    return metric

@@ -0,0 +1,13 @@
# Multiply any float fields by 10
#
# Example Input:
# modbus,host=hostname Current=1.22,Energy=0,Frequency=60i,Power=0,Voltage=123.9000015258789 1554079521000000000
#
# Example Output:
# modbus,host=hostname Current=12.2,Energy=0,Frequency=60i,Power=0,Voltage=1239.000015258789 1554079521000000000
def apply(metric):
    for k, v in metric.fields.items():
        if type(v) == "float":
            metric.fields[k] = v * 10
    return metric

@@ -0,0 +1,96 @@
# Produces a new line of statistics about the fields.
# Drops the original metric.
#
# Example Input:
# logstash,environment_id=EN456,property_id=PR789,request_type=ingress,stack_id=engd asn=1313i,cache_response_code=202i,colo_code="LAX",colo_id=12i,compute_time=28736i,edge_end_timestamp=1611085500320i,edge_start_timestamp=1611085496208i,id="1b5c67ed-dfd0-4d30-99bd-84f0a9c5297b_76af1809-29d1-4b35-a0cf-39797458275c",parent_ray_id="00",processing_details="ok",rate_limit_id=0i,ray_id="76af1809-29d1-4b35-a0cf-39797458275c",request_bytes=7777i,request_host="engd-08364a825824e04f0a494115.reactorstream.dev",request_id="1b5c67ed-dfd0-4d30-99bd-84f0a9c5297b",request_result="succeeded",request_uri="/ENafcb2798a9be4bb7bfddbf35c374db15",response_code=200i,subrequest=false,subrequest_count=1i,user_agent="curl/7.64.1" 1611085496208
#
# Example Output:
# sizing,measurement=logstash,environment_id=EN456,property_id=PR789,request_type=ingress,stack_id=engd tag_count=4,tag_key_avg_length=11.25,tag_value_avg_length=5.25,int_key_avg_length=13.4,int_avg_length=4.9,int_count=10,bool_key_avg_length=10,bool_avg_length=5,bool_count=1,str_key_avg_length=10.5,str_avg_length=25.4,str_count=10 1611085496208
def apply(metric):
    new_metric = Metric("sizing")
    num_tags = len(metric.tags.items())
    new_metric.fields["tag_count"] = float(num_tags)
    new_metric.fields["tag_key_avg_length"] = sum(map(len, metric.tags.keys())) / num_tags
    new_metric.fields["tag_value_avg_length"] = sum(map(len, metric.tags.values())) / num_tags
    new_metric.tags["measurement"] = metric.name
    new_metric.tags.update(metric.tags)

    ints, floats, bools, strs = [], [], [], []
    for field in metric.fields.items():
        key, value = field[0], field[1]
        if type(value) == "int":
            ints.append(field)
        elif type(value) == "float":
            floats.append(field)
        elif type(value) == "bool":
            bools.append(field)
        elif type(value) == "string":
            strs.append(field)

    if len(ints) > 0:
        int_keys = [i[0] for i in ints]
        int_vals = [i[1] for i in ints]
        produce_pairs(new_metric, int_keys, "int", key=True)
        produce_pairs(new_metric, int_vals, "int")
    if len(floats) > 0:
        float_keys = [i[0] for i in floats]
        float_vals = [i[1] for i in floats]
        produce_pairs(new_metric, float_keys, "float", key=True)
        produce_pairs(new_metric, float_vals, "float")
    if len(bools) > 0:
        bool_keys = [i[0] for i in bools]
        bool_vals = [i[1] for i in bools]
        produce_pairs(new_metric, bool_keys, "bool", key=True)
        produce_pairs(new_metric, bool_vals, "bool")
    if len(strs) > 0:
        str_keys = [i[0] for i in strs]
        str_vals = [i[1] for i in strs]
        produce_pairs(new_metric, str_keys, "str", key=True)
        produce_pairs(new_metric, str_vals, "str")

    new_metric.time = metric.time
    return new_metric

def produce_pairs(metric, li, field_type, key=False):
    lens = elem_lengths(li)
    counts = count_lengths(lens)
    metric.fields["{}_count".format(field_type)] = float(len(li))
    if key:
        metric.fields["{}_key_avg_length".format(field_type)] = float(mean(lens))
    else:
        metric.fields["{}_avg_length".format(field_type)] = float(mean(lens))

def elem_lengths(li):
    if type(li[0]) in ("int", "float", "bool"):
        return [len(str(elem)) for elem in li]
    else:
        return [len(elem) for elem in li]

def count_lengths(li):
    # Returns dict of counts of each occurrence of length in a list of lengths
    lens = []
    counts = []
    for elem in li:
        if elem not in lens:
            lens.append(elem)
            counts.append(1)
        else:
            index = lens.index(elem)
            counts[index] += 1
    return dict(zip(lens, counts))

def map(f, li):
    return [f(x) for x in li]

def sum(li):
    sum = 0
    for i in li:
        sum += i
    return sum

def mean(li):
    return sum(li) / len(li)

@@ -0,0 +1,320 @@
# This Starlark processor is used when loading Sparkplug B protobuf
# messages into InfluxDB. The data source is an Opto 22 Groov EPIC controller.
#
# This processor does the following:
# - Resolves the metric name using a numeric alias.
# When the EPIC MQTT client is started it sends a DBIRTH message
# that lists all metrics configured on the controller and includes
# a sequential numeric alias to reference it by.
# This processor stores that information in the array states["aliases"].
# When subsequent DDATA messages are published, the numeric alias is
# used to find the stored metric name in the array states["aliases"].
# - Splits the MQTT topic into 5 fields which can be used as tags in InfluxDB.
# - Splits the metric name into 6 fields which can be used as tags in InfluxDB.
# - Deletes the host, type, topic, name and alias tags
#
# TODO:
# The requirement that a DBIRTH message has to be received before DDATA messages
# can be used creates a significant reliability issue and a debugging mess.
# I have to go into the Groov EPIC controller and restart the MQTT client every time
# I restart the telegraf loader. This has caused many hours of needless frustration.
#
# I see a few possible solutions:
# - Opto 22 changes their software making it optional to drop the alias
#   and simply include the name in the DDATA messages. In my case it's never more
#   than 15 characters. This is the simplest and most reliable solution.
# - Make a system call from telegraf using SSH to remotely restart the MQTT client.
# - Have telegraf send a message through MQTT requesting a DBIRTH message from the EPIC Controller.
#
# Example Input:
# edge,host=firefly,topic=spBv1.0/SF/DDATA/epiclc/Exp501 type=9i,value=22.247711,alias=10i 1626475876000000000
# edge,host=firefly,topic=spBv1.0/SF/DDATA/epiclc/Exp501 alias=10i,type=9i,value=22.231323 1626475877000000000
# edge,host=firefly,topic=spBv1.0/SF/DBIRTH/epiclc/Exp501 type=9i,name="Strategy/IO/I_Ch_TC_Right",alias=9i 1626475880000000000
# edge,host=firefly,topic=spBv1.0/SF/DBIRTH/epiclc/Exp501 value=22.200958,name="Strategy/IO/I_Ch_TC_Top_C",type=9i,alias=10i 1626475881000000000
# edge,host=firefly,topic=spBv1.0/SF/DDATA/epiclc/Exp501 alias=10i,type=9i,value=22.177643 1626475884000000000
# edge,host=firefly,topic=spBv1.0/SF/DDATA/epiclc/Exp501 type=9i,value=22.231903,alias=10i 1626475885000000000
# edge,host=firefly,topic=spBv1.0/SF/DDATA/epiclc/Exp501 value=22.165192,alias=10i,type=9i 1626475895000000000
# edge,host=firefly,topic=spBv1.0/SF/DDATA/epiclc/Exp501 alias=10i,type=9i,value=22.127106 1626475896000000000
#
# Example Output:
# C,Component=Ch,Datatype=IO,Device=TC,EdgeID=epiclc,Experiment=Exp501,Metric=I_Ch_TC_Top_C,MsgType=DBIRTH,Position=Top,Reactor=SF,Source=Strategy value=22.200958 1626475881000000000
# C,Component=Ch,Datatype=IO,Device=TC,EdgeID=epiclc,Experiment=Exp501,Metric=I_Ch_TC_Top_C,MsgType=DDATA,Position=Top,Reactor=SF,Source=Strategy value=22.177643 1626475884000000000
# C,Component=Ch,Datatype=IO,Device=TC,EdgeID=epiclc,Experiment=Exp501,Metric=I_Ch_TC_Top_C,MsgType=DDATA,Position=Top,Reactor=SF,Source=Strategy value=22.231903 1626475885000000000
# C,Component=Ch,Datatype=IO,Device=TC,EdgeID=epiclc,Experiment=Exp501,Metric=I_Ch_TC_Top_C,MsgType=DDATA,Position=Top,Reactor=SF,Source=Strategy value=22.165192 1626475895000000000
# C,Component=Ch,Datatype=IO,Device=TC,EdgeID=epiclc,Experiment=Exp501,Metric=I_Ch_TC_Top_C,MsgType=DDATA,Position=Top,Reactor=SF,Source=Strategy value=22.127106 1626475896000000000
#############################################
# The following is the telegraf.conf used when calling this processor
# [[inputs.mqtt_consumer]]
# servers = ["tcp://your_server:1883"]
# qos = 0
# connection_timeout = "30s"
# topics = ["spBv1.0/#"]
# persistent_session = false
# client_id = ""
# username = "your username"
# password = "your password"
#
# # Sparkplug protobuf configuration
# data_format = "xpath_protobuf"
#
# # URL of sparkplug protobuf prototype
# xpath_protobuf_type = "org.eclipse.tahu.protobuf.Payload"
#
# # Location of sparkplug_b.proto file
# xpath_protobuf_file = "/apps/telegraf/config/sparkplug_b.proto"
#
# [[inputs.mqtt_consumer.xpath_protobuf]]
# metric_selection = "metrics[not(template_value)]"
# metric_name = "concat('edge', substring-after(name, ' '))"
# timestamp = "timestamp"
# timestamp_format = "unix_ms"
# [inputs.mqtt_consumer.xpath_protobuf.tags]
# name = "substring-after(name, ' ')"
# [inputs.mqtt_consumer.xpath_protobuf.fields_int]
# type = "datatype"
# alias = "alias"
# [inputs.mqtt_consumer.xpath_protobuf.fields]
# # A metric value must be numeric
# value = "number((int_value | long_value | float_value | double_value | boolean_value))"
# name = "name"
#
# # Starlark processor
# [[processors.starlark]]
# script = "sparkplug.star"
#
# # Optionally Define constants used in sparkplug.star
# # Constants can be defined here or they can be defined in the
# # sparkplug_b.star file.
#
# [processors.starlark.constants]
#
# # NOTE: The remaining fields can be specified either here or in the starlark script.
#
# # Tags used to identify message type - 3rd field of topic
# BIRTH_TAG = "BIRTH/"
# DEATH_TAG = "DEATH/"
# DATA_TAG = "DATA/"
#
# # Number of messages to hold if alias cannot be resolved
# MAX_UNRESOLVED = 3
#
# # Provide alternate names for the 5 sparkplug topic fields.
# # The topic contains 5 fields separated by the '/' character.
# # Define the tag name for each of these fields.
# MSG_FORMAT = "false" #0
# GROUP_ID = "reactor" #1
# MSG_TYPE = "false" #2
# EDGE_ID = "edgeid" #3
# DEVICE_ID = "experiment" #4
#
BIRTH_TAG = "BIRTH/"
DEATH_TAG = "DEATH/"
DATA_TAG = "DATA/"
# Number of messages to hold if alias cannot be resolved
MAX_UNRESOLVED = 3
# Provide alternate names for the 5 sparkplug topic fields.
# The topic contains 5 fields separated by the '/' character.
# Define the tag name for each of these fields.
MSG_FORMAT = "false" #0
GROUP_ID = "Reactor" #1
MSG_TYPE = "MsgType" #2
EDGE_ID = "EdgeID" #3
DEVICE_ID = "Experiment" #4
########### Begin sparkplug.star script
load("logging.star", "log")
state = {
"aliases": dict(),
"devices": dict(),
"unresolved": list()
}
def extractTopicTags(metric):
msg_format = ''
groupid = ''
msg_type = ''
edgeid = ''
deviceid = ''
topic = metric.tags.get("topic", "");
fields = topic.split("/");
nfields = len(fields)
if nfields > 0: msg_format = fields[0]
if nfields > 1: groupid = fields[1]
if nfields > 2: msg_type = fields[2]
if nfields > 3: edgeid = fields[3]
if nfields > 4: deviceid = fields[4]
return [msg_format, groupid, msg_type, edgeid, deviceid]
def buildTopicTags(metric, topicFields):
# Remove topic and host tags - they are not useful for analysis
metric.tags.pop("topic")
metric.tags.pop("host")
if MSG_FORMAT != "false": metric.tags[MSG_FORMAT] = topicFields[0]
if GROUP_ID != "false": metric.tags[GROUP_ID] = topicFields[1]
if MSG_TYPE != "false": metric.tags[MSG_TYPE] = topicFields[2]
if EDGE_ID != "false": metric.tags[EDGE_ID] = topicFields[3]
if DEVICE_ID != "false": metric.tags[DEVICE_ID] = topicFields[4]
def buildNameTags(metric,name):
# Remove type and alias from metric.fields - They are not useful for analysis
metric.fields.pop("type")
metric.fields.pop("alias")
if "name" in metric.fields:
metric.fields.pop("name")
# The Groov EPIC metric names are comprised of 3 fields separated by a '/'
# source, datatype, and metric name
# Extract these fields and include them as tags.
fields = name.split('/')
nfields = len(fields)
if nfields > 0:
metric.tags["Source"] = fields[0]
if nfields > 1:
metric.tags["Datatype"] = fields[1]
if nfields > 2:
metric.tags["Metric"] = fields[2]
# OPTIONAL
#
# By using underscore characters the metric name can be further
# divided into additional tags.
# How this is defined is site specific.
# Customize this as you wish
# The following demonstrates dividing the metric name into 3, 4 or 5 new tags
# A metric name must have between 3-5 underscore separated fields
# If there is only one or two fields then the only tag created is 'metric'
# which has the full name
#
# The last field is Units and is filled before fields 3, 4 and 5
# Ex: C, V, Torr, W, psi, RPM, On....
# The units are used in Influx as the 'measurement' name.
#
#
# Fields 3, 4 and 5 (device, position, composition) are optional
# measurement_component_device_position_composition_units
#
# Ex: I_FuelTank1_C (2 fields)
# Measurement I
# Component FuelTank1
# Units C
#
# I_FuelTank1_TC_Outlet_C (5 fields)
# Measurement I
# Component FuelTank1
# Device TC
# Position Outlet
# Units C
#
# I_FuelTank1_TC_Outlet_Premium_C (6 fields)
# Measurement I
# Component FuelTank1
# Device TC
# Position Outlet
# Composition Premium
# Units C
# Split the metric name into fields using '_'
sfields = fields[2].split('_')
nf = len(sfields)
# Don't split the name if it's one or two fields
if nf <= 2:
metric.name = "Name"
if nf > 2:
metric.name = sfields[nf-1] # The Units are used for the metric name
metric.tags["Component"] = sfields[1]
if nf > 3:
metric.tags["Device"] = sfields[2]
if nf > 4:
metric.tags["Position"] = sfields[3]
if nf > 5:
metric.tags["Composition"] = sfields[4]
def apply(metric):
output = metric
log.debug("apply metric: {}".format(metric))
topic = metric.tags.get("topic", "")
topicFields = extractTopicTags(metric)
edgeid = topicFields[3] # Sparkplug spec specifies 4th field as edgeid
# Split the topic into fields and assign to variables
# Determine if the message is of type birth and if so add it to the "devices" LUT.
if DEATH_TAG in topic:
output = None
elif BIRTH_TAG in topic:
log.debug(" metric msg_type: {} edgeid: {} topic: {}".format(BIRTH_TAG, edgeid, topic))
if "alias" in metric.fields and "name" in metric.fields:
# Create the lookup-table using "${edgeid}/${alias}" as the key and "${name}" as value
alias = metric.fields.get("alias")
name = metric.fields.get("name")
id = "{}/{}".format(edgeid,alias)
log.debug(" --> setting alias: {} name: {} id: {}'".format(alias, name, id))
state["aliases"][id] = name
if "value" in metric.fields:
buildTopicTags(metric, topicFields)
buildNameTags(metric, name)
else:
output = None
# Try to resolve the unresolved if any
if len(state["unresolved"]) > 0:
# Filter out the matching metrics and keep the rest as unresolved
log.debug(" unresolved")
unresolved = [("{}/{}".format(edgeid, m.fields["alias"]), m) for m in state["unresolved"]]
matching = [(mid, m) for mid, m in unresolved if mid == id]
state["unresolved"] = [m for mid, m in unresolved if mid != id]
log.debug(" found {} matching unresolved metrics".format(len(matching)))
# Process the matching metrics and output - TODO - needs debugging
# for mid, m in matching:
# buildTopicTags(m,topicFields)
# buildNameTags(m)
# output = [m for _, m in matching] + [metric]
elif DATA_TAG in topic:
log.debug(" metric msg_type: {} edgeid: {} topic: {}".format(DATA_TAG, edgeid, topic))
if "alias" in metric.fields:
alias = metric.fields.get("alias")
# Lookup the ID. If we know it, replace the name of the metric with the lookup value,
# otherwise we need to keep the metric for resolving later.
# This can happen if the messages are out-of-order for some reason...
id = "{}/{}".format(edgeid,alias)
if id in state["aliases"]:
name = state["aliases"][id]
log.debug(" found alias: {} name: {}".format(alias, name))
buildTopicTags(metric,topicFields)
buildNameTags(metric,name)
else:
# We want to hold the metric until we get the corresponding birth message
log.debug(" id not found: {}".format(id))
output = None
if len(state["unresolved"]) >= MAX_UNRESOLVED:
log.warn(" metric overflow, trimming {}".format(len(state["unresolved"]) - MAX_UNRESOLVED+1))
# Release the unresolved metrics as raw and trim buffer
output = state["unresolved"][MAX_UNRESOLVED-1:]
state["unresolved"] = state["unresolved"][:MAX_UNRESOLVED-1]
log.debug(" --> keeping metric")
state["unresolved"].append(metric)
else:
output = None
return output

@@ -0,0 +1,19 @@
# Example of parsing a date out of a field and modifying the metric to inject the year, month and day.
#
# Example Input:
# time value="2009-06-12T12:06:10.000000099" 1465839830100400201
#
# Example Output:
# time year=2009i,month=6i,day=12i 1465839830100400201
load('time.star', 'time')
# loads time.parse_duration(), time.is_valid_timezone(), time.now(), time.time(),
# time.parse_time() and time.from_timestamp()
def apply(metric):
    date = time.parse_time(metric.fields.get('value'), format="2006-01-02T15:04:05.999999999", location="UTC")
    metric.fields.pop('value')
    metric.fields["year"] = date.year
    metric.fields["month"] = date.month
    metric.fields["day"] = date.day
    return metric

@@ -0,0 +1,17 @@
# Example of parsing a duration out of a field and modifying the metric to inject the equivalent in seconds.
#
# Example Input:
# time value="3m35s" 1465839830100400201
#
# Example Output:
# time seconds=215 1465839830100400201
load('time.star', 'time')
# loads time.parse_duration(), time.is_valid_timezone(), time.now(), time.time(),
# time.parse_time() and time.from_timestamp()
def apply(metric):
    duration = time.parse_duration(metric.fields.get('value'))
    metric.fields.pop('value')
    metric.fields["seconds"] = duration.seconds
    return metric

@@ -0,0 +1,15 @@
# Example of setting the metric timestamp to the current time.
#
# Example Input:
# time result="OK" 1515581000000000000
#
# Example Output:
# time result="OK" 1618488000000000999
load('time.star', 'time')
def apply(metric):
    # You can set the timestamp by using the current time.
    metric.time = time.now().unix_nano
    return metric

@@ -0,0 +1,22 @@
# Example of filtering metrics based on the timestamp in seconds.
#
# Example Input:
# time result="KO" 1616020365100400201
# time result="OK" 1616150517100400201
#
# Example Output:
# time result="OK" 1616150517100400201
load('time.star', 'time')
# loads time.parse_duration(), time.is_valid_timezone(), time.now(), time.time(),
# time.parse_time() and time.from_timestamp()
def apply(metric):
    # 1616198400 sec = Saturday, March 20, 2021 0:00:00 GMT
    refDate = time.from_timestamp(1616198400)
    # 1616020365 sec = Wednesday, March 17, 2021 22:32:45 GMT
    # 1616150517 sec = Friday, March 19, 2021 10:41:57 GMT
    metric_date = time.from_timestamp(int(metric.time / 1e9))
    # Only keep metrics with a timestamp that is not more than 24 hours before the reference date
    if refDate - time.parse_duration("24h") < metric_date:
        return metric

@@ -0,0 +1,22 @@
# Example of filtering metrics based on the timestamp with nanoseconds.
#
# Example Input:
# time result="KO" 1617900602123455999
# time result="OK" 1617900602123456789
#
# Example Output:
# time result="OK" 1617900602123456789
load('time.star', 'time')
# loads time.parse_duration(), time.is_valid_timezone(), time.now(), time.time(),
# time.parse_time() and time.from_timestamp()
def apply(metric):
    # 1617900602123457000 nanosec = Thursday, April 8, 2021 16:50:02.123457000 GMT
    refDate = time.from_timestamp(1617900602, 123457000)
    # 1617900602123455999 nanosec = Thursday, April 8, 2021 16:50:02.123455999 GMT
    # 1617900602123456789 nanosec = Thursday, April 8, 2021 16:50:02.123456789 GMT
    metric_date = time.from_timestamp(int(metric.time / 1e9), int(metric.time % 1e9))
    # Only keep metrics with a timestamp that is not more than 1 microsecond before the reference date
    if refDate - time.parse_duration("1us") < metric_date:
        return metric

@@ -0,0 +1,18 @@
# Filter metrics by value
'''
In this example we look at the `value` field of the metric.
If the value is zero, we delete all the fields, effectively dropping the metric.
Example Input:
temperature sensor="001A0",value=111.48 1618488000000000999
temperature sensor="001B0",value=0.0 1618488000000000999
Example Output:
temperature sensor="001A0",value=111.48 1618488000000000999
'''
def apply(metric):
if metric.fields["value"] == 0.0:
# removing all fields deletes a metric
metric.fields.clear()
return metric