
Adding upstream version 1.34.4.

Signed-off-by: Daniel Baumann <daniel@debian.org>
Daniel Baumann 2025-05-24 07:26:29 +02:00
parent e393c3af3f
commit 4978089aab
Signed by: daniel
GPG key ID: FBB4F0E80A80222F
4963 changed files with 677545 additions and 0 deletions


@ -0,0 +1,126 @@
# AWS EC2 Metadata Processor Plugin
AWS EC2 Metadata processor plugin appends metadata gathered from [AWS IMDS][]
to metrics associated with EC2 instances.
[AWS IMDS]: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html
## Global configuration options <!-- @/docs/includes/plugin_config.md -->
In addition to the plugin-specific configuration settings, plugins support
additional global and plugin configuration settings. These settings are used to
modify metrics, tags, and fields, or to create aliases and configure ordering, etc.
See the [CONFIGURATION.md][CONFIGURATION.md] for more details.
[CONFIGURATION.md]: ../../../docs/CONFIGURATION.md#plugins
## Configuration
```toml @sample.conf
# Attach AWS EC2 metadata to metrics
[[processors.aws_ec2]]
## Instance identity document tags to attach to metrics.
## For more information see:
## https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-identity-documents.html
##
## Available tags:
## * accountId
## * architecture
## * availabilityZone
## * billingProducts
## * imageId
## * instanceId
## * instanceType
## * kernelId
## * pendingTime
## * privateIp
## * ramdiskId
## * region
## * version
# imds_tags = []
## EC2 instance tags retrieved with DescribeTags action.
## If a tag is empty upon retrieval, it is omitted when tagging metrics.
## Note that for this to work, the role attached to the EC2 instance or the AWS
## credentials available from the environment must have a policy attached that
## allows ec2:DescribeTags.
##
## For more information see:
## https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeTags.html
# ec2_tags = []
## Paths to instance metadata information to attach to the metrics.
## Specify the full path without the base-path e.g. `tags/instance/Name`.
##
## For more information see:
## https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html
# metadata_paths = []
## Convert metadata tag names to canonical names representing the full path,
## with slashes ('/') replaced by underscores. By default, only the last path
## element is used to name the tag.
# canonical_metadata_tags = false
## Timeout for HTTP requests made against the AWS EC2 metadata endpoint.
# timeout = "10s"
## ordered controls whether or not the metrics need to stay in the same order
## this plugin received them in. If false, this plugin will change the order
## with requests hitting cached results moving through immediately and not
## waiting on slower lookups. This may cause issues for you if you are
## depending on the order of metrics staying the same. If so, set this to true.
## Keeping the metrics ordered may be slightly slower.
# ordered = false
## max_parallel_calls is the maximum number of AWS API calls to be in flight
## at the same time.
## It's probably best to keep this number fairly low.
# max_parallel_calls = 10
## cache_ttl determines how long each cached item will remain in the cache before
## it is removed and subsequently needs to be queried for from the AWS API. With
## the default of "0s", cached items do not expire.
# cache_ttl = "0s"
## tag_cache_size determines how many of the values which are found in imds_tags
## or ec2_tags will be kept in memory for faster lookup on successive processing
## of metrics. You may want to adjust this if you have excessively large numbers
## of tags on your EC2 instances, and you are using the ec2_tags field. This
## typically does not need to be changed when using the imds_tags field.
# tag_cache_size = 1000
## log_cache_stats will emit a log line periodically to stdout with details of
## cache entries, hits, misses, and evacuations since the last time stats were
## emitted. This can be helpful in determining whether caching is being effective
## in your environment. Stats are emitted every 30 seconds. By default, this
## setting is disabled.
# log_cache_stats = false
```
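As a rough illustration of how `canonical_metadata_tags` affects tag naming, the sketch below mirrors the key derivation the processor applies to each entry in `metadata_paths`. The helper name `tagNameForPath` is hypothetical and exists only for this example; it is not part of the plugin's API.

```go
package main

import (
	"fmt"
	"strings"
)

// tagNameForPath is a hypothetical helper that derives a tag name from a
// metadata path: either the full path with slashes replaced by underscores
// (canonical) or only the last path element (the default).
func tagNameForPath(path string, canonical bool) string {
	// Leading/trailing slashes and spaces are not part of the tag name.
	key := strings.Trim(path, "/ ")
	if canonical {
		return strings.ReplaceAll(key, "/", "_")
	}
	if idx := strings.LastIndex(key, "/"); idx > 0 {
		key = key[idx+1:]
	}
	return key
}

func main() {
	fmt.Println(tagNameForPath("tags/instance/Name", false)) // Name
	fmt.Println(tagNameForPath("tags/instance/Name", true))  // tags_instance_Name
}
```

With the default setting, two paths that end in the same element would map to the same tag name, which is one reason to consider enabling canonical names when several paths are configured.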
## Example
Append `accountId` and `instanceId` to metrics tags:
```toml
[[processors.aws_ec2]]
imds_tags = ["accountId", "instanceId"]
```
```diff
- cpu,hostname=localhost time_idle=42
+ cpu,hostname=localhost,accountId=123456789,instanceId=i-123456789123 time_idle=42
```
## Notes
The plugin uses a single cache for all tags because Telegraf's `AddTag`
function models tags the same way: one value per tag name.
A user can specify both EC2 tags and IMDS tags, and the two lists can,
technically, contain the same name. In that case the EC2 tag's value overrides
the IMDS tag's value.
Though this is undesirable, it is unavoidable because the `AddTag` function does
not support multiple values for the same tag name.
Avoid giving EC2 tags the same names as IMDS tags, because the EC2 tags will
always "win": this plugin processes them *after* the IMDS tags.
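The override behavior described in these notes can be sketched with a plain map. This is a simplified model under stated assumptions: `applyTags` is a hypothetical stand-in for successive `AddTag` calls, and it models only the processing order, not the plugin's cache or the AWS lookups.

```go
package main

import "fmt"

// applyTags is a hypothetical stand-in for repeated AddTag calls: a later
// write to the same tag name replaces the earlier value.
func applyTags(imdsTags, ec2Tags map[string]string) map[string]string {
	tags := make(map[string]string, len(imdsTags)+len(ec2Tags))
	// IMDS tags are applied first.
	for k, v := range imdsTags {
		tags[k] = v
	}
	// EC2 tags are applied afterwards and replace any IMDS value with the same name.
	for k, v := range ec2Tags {
		tags[k] = v
	}
	return tags
}

func main() {
	imdsTags := map[string]string{"region": "eu-west-1"}
	ec2Tags := map[string]string{"region": "custom-value"}
	fmt.Println(applyTags(imdsTags, ec2Tags)["region"]) // custom-value
}
```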


@ -0,0 +1,401 @@
//go:generate ../../../tools/readme_config_includer/generator
package aws_ec2
import (
"context"
_ "embed"
"errors"
"fmt"
"io"
"slices"
"strings"
"time"
"github.com/aws/aws-sdk-go-v2/aws"
awsconfig "github.com/aws/aws-sdk-go-v2/config"
"github.com/aws/aws-sdk-go-v2/feature/ec2/imds"
"github.com/aws/aws-sdk-go-v2/service/ec2"
"github.com/aws/aws-sdk-go-v2/service/ec2/types"
"github.com/aws/smithy-go"
"github.com/coocood/freecache"
"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/config"
"github.com/influxdata/telegraf/plugins/common/parallel"
"github.com/influxdata/telegraf/plugins/processors"
)
//go:embed sample.conf
var sampleConfig string
type AwsEc2Processor struct {
ImdsTags []string `toml:"imds_tags"`
EC2Tags []string `toml:"ec2_tags"`
MetadataPaths []string `toml:"metadata_paths"`
CanonicalMetadataTags bool `toml:"canonical_metadata_tags"`
Timeout config.Duration `toml:"timeout"`
CacheTTL config.Duration `toml:"cache_ttl"`
Ordered bool `toml:"ordered"`
MaxParallelCalls int `toml:"max_parallel_calls"`
TagCacheSize int `toml:"tag_cache_size"`
LogCacheStats bool `toml:"log_cache_stats"`
Log telegraf.Logger `toml:"-"`
tagCache *freecache.Cache
imdsClient *imds.Client
ec2Client *ec2.Client
parallel parallel.Parallel
instanceID string
cancelCleanupWorker context.CancelFunc
}
const (
DefaultMaxOrderedQueueSize = 10_000
DefaultMaxParallelCalls = 10
DefaultTimeout = 10 * time.Second
DefaultCacheTTL = 0 * time.Hour
DefaultCacheSize = 1000
DefaultLogCacheStats = false
)
var allowedImdsTags = []string{
"accountId",
"architecture",
"availabilityZone",
"billingProducts",
"imageId",
"instanceId",
"instanceType",
"kernelId",
"pendingTime",
"privateIp",
"ramdiskId",
"region",
"version",
}
func (*AwsEc2Processor) SampleConfig() string {
return sampleConfig
}
func (r *AwsEc2Processor) Add(metric telegraf.Metric, _ telegraf.Accumulator) error {
r.parallel.Enqueue(metric)
return nil
}
func (r *AwsEc2Processor) Init() error {
r.Log.Debug("Initializing AWS EC2 Processor")
if len(r.ImdsTags) == 0 && len(r.MetadataPaths) == 0 && len(r.EC2Tags) == 0 {
return errors.New("no tags specified in configuration")
}
for _, tag := range r.ImdsTags {
if tag == "" || !slices.Contains(allowedImdsTags, tag) {
return fmt.Errorf("invalid imds tag %q", tag)
}
}
return nil
}
func (r *AwsEc2Processor) Start(acc telegraf.Accumulator) error {
r.tagCache = freecache.NewCache(r.TagCacheSize)
if r.LogCacheStats {
ctx, cancel := context.WithCancel(context.Background())
r.cancelCleanupWorker = cancel
go r.logCacheStatistics(ctx)
}
r.Log.Debugf("cache: size=%d\n", r.TagCacheSize)
if r.CacheTTL > 0 {
r.Log.Debugf("cache timeout: seconds=%d\n", int(time.Duration(r.CacheTTL).Seconds()))
}
ctx := context.Background()
cfg, err := awsconfig.LoadDefaultConfig(ctx)
if err != nil {
return fmt.Errorf("failed loading default AWS config: %w", err)
}
r.imdsClient = imds.NewFromConfig(cfg)
iido, err := r.imdsClient.GetInstanceIdentityDocument(
ctx,
&imds.GetInstanceIdentityDocumentInput{},
)
if err != nil {
return fmt.Errorf("failed getting instance identity document: %w", err)
}
r.instanceID = iido.InstanceID
if len(r.EC2Tags) > 0 {
// Add region to AWS config when creating EC2 service client since it's required.
cfg.Region = iido.Region
r.ec2Client = ec2.NewFromConfig(cfg)
// Check if instance is allowed to call DescribeTags.
_, err = r.ec2Client.DescribeTags(ctx, &ec2.DescribeTagsInput{
DryRun: aws.Bool(true),
})
var ae smithy.APIError
if errors.As(err, &ae) {
if ae.ErrorCode() != "DryRunOperation" {
return fmt.Errorf("instance doesn't have permissions to call DescribeTags: %w", err)
}
} else if err != nil {
return fmt.Errorf("error calling DescribeTags: %w", err)
}
}
if r.Ordered {
r.parallel = parallel.NewOrdered(acc, r.asyncAdd, DefaultMaxOrderedQueueSize, r.MaxParallelCalls)
} else {
r.parallel = parallel.NewUnordered(acc, r.asyncAdd, r.MaxParallelCalls)
}
return nil
}
func (r *AwsEc2Processor) Stop() {
if r.parallel != nil {
r.parallel.Stop()
}
if r.cancelCleanupWorker != nil {
r.cancelCleanupWorker()
r.cancelCleanupWorker = nil
}
}
func (r *AwsEc2Processor) logCacheStatistics(ctx context.Context) {
if r.tagCache == nil {
return
}
ticker := time.NewTicker(30 * time.Second)
// Release the ticker's resources once the worker is cancelled.
defer ticker.Stop()
for {
select {
case <-ctx.Done():
return
case <-ticker.C:
r.Log.Debugf("cache: size=%d hit=%d miss=%d full=%d\n",
r.tagCache.EntryCount(),
r.tagCache.HitCount(),
r.tagCache.MissCount(),
r.tagCache.EvacuateCount(),
)
r.tagCache.ResetStatistics()
}
}
}
func (r *AwsEc2Processor) lookupIMDSTags(metric telegraf.Metric) telegraf.Metric {
ctx, cancel := context.WithTimeout(context.Background(), time.Duration(r.Timeout))
defer cancel()
var tagsNotFound []string
for _, tag := range r.ImdsTags {
val, err := r.tagCache.Get([]byte(tag))
if err != nil {
tagsNotFound = append(tagsNotFound, tag)
} else {
metric.AddTag(tag, string(val))
}
}
if len(tagsNotFound) == 0 {
return metric
}
doc, err := r.imdsClient.GetInstanceIdentityDocument(ctx, &imds.GetInstanceIdentityDocumentInput{})
if err != nil {
r.Log.Errorf("Error when calling GetInstanceIdentityDocument: %v", err)
return metric
}
for _, tag := range tagsNotFound {
var v string
switch tag {
case "accountId":
v = doc.AccountID
case "architecture":
v = doc.Architecture
case "availabilityZone":
v = doc.AvailabilityZone
case "billingProducts":
v = strings.Join(doc.BillingProducts, ",")
case "imageId":
v = doc.ImageID
case "instanceId":
v = doc.InstanceID
case "instanceType":
v = doc.InstanceType
case "kernelId":
v = doc.KernelID
case "pendingTime":
v = doc.PendingTime.String()
case "privateIp":
v = doc.PrivateIP
case "ramdiskId":
v = doc.RamdiskID
case "region":
v = doc.Region
case "version":
v = doc.Version
default:
continue
}
metric.AddTag(tag, v)
expiration := int(time.Duration(r.CacheTTL).Seconds())
if err := r.tagCache.Set([]byte(tag), []byte(v), expiration); err != nil {
r.Log.Errorf("Error when setting IMDS tag cache value: %v", err)
continue
}
}
return metric
}
func (r *AwsEc2Processor) lookupMetadata(metric telegraf.Metric) telegraf.Metric {
ctx, cancel := context.WithTimeout(context.Background(), time.Duration(r.Timeout))
defer cancel()
for _, path := range r.MetadataPaths {
key := strings.Trim(path, "/ ")
if r.CanonicalMetadataTags {
key = strings.ReplaceAll(key, "/", "_")
} else {
if idx := strings.LastIndex(key, "/"); idx > 0 {
key = key[idx+1:]
}
}
// Try to lookup the tag in cache
if value, err := r.tagCache.Get([]byte("metadata/" + path)); err == nil {
metric.AddTag(key, string(value))
continue
}
// Query the tag with the full path
resp, err := r.imdsClient.GetMetadata(ctx, &imds.GetMetadataInput{Path: path})
if err != nil {
r.Log.Errorf("Getting metadata %q failed: %v", path, err)
continue
}
value, err := io.ReadAll(resp.Content)
if err != nil {
r.Log.Errorf("Reading metadata response for %q failed: %v", path, err)
continue
}
if len(value) > 0 {
metric.AddTag(key, string(value))
}
expiration := int(time.Duration(r.CacheTTL).Seconds())
if err = r.tagCache.Set([]byte("metadata/"+path), value, expiration); err != nil {
r.Log.Errorf("Updating metadata cache for %q failed: %v", path, err)
continue
}
}
return metric
}
func (r *AwsEc2Processor) lookupEC2Tags(metric telegraf.Metric) telegraf.Metric {
ctx, cancel := context.WithTimeout(context.Background(), time.Duration(r.Timeout))
defer cancel()
var tagsNotFound []string
for _, tag := range r.EC2Tags {
val, err := r.tagCache.Get([]byte(tag))
if err != nil {
tagsNotFound = append(tagsNotFound, tag)
} else {
metric.AddTag(tag, string(val))
}
}
if len(tagsNotFound) == 0 {
return metric
}
dto, err := r.ec2Client.DescribeTags(ctx, &ec2.DescribeTagsInput{
Filters: []types.Filter{
{
Name: aws.String("resource-id"),
Values: []string{r.instanceID},
},
{
Name: aws.String("key"),
Values: r.EC2Tags,
},
},
})
if err != nil {
r.Log.Errorf("Error during EC2 DescribeTags: %v", err)
return metric
}
for _, tag := range r.EC2Tags {
if v := getTagFromDescribeTags(dto, tag); v != "" {
metric.AddTag(tag, v)
expiration := int(time.Duration(r.CacheTTL).Seconds())
err = r.tagCache.Set([]byte(tag), []byte(v), expiration)
if err != nil {
r.Log.Errorf("Error when setting EC2Tags tag cache value: %v", err)
}
}
}
return metric
}
func (r *AwsEc2Processor) asyncAdd(metric telegraf.Metric) []telegraf.Metric {
// Add IMDS Instance Identity Document tags.
if len(r.ImdsTags) > 0 {
metric = r.lookupIMDSTags(metric)
}
// Add instance metadata tags.
if len(r.MetadataPaths) > 0 {
metric = r.lookupMetadata(metric)
}
// Add EC2 instance tags.
if len(r.EC2Tags) > 0 {
metric = r.lookupEC2Tags(metric)
}
return []telegraf.Metric{metric}
}
func init() {
processors.AddStreaming("aws_ec2", func() telegraf.StreamingProcessor {
return newAwsEc2Processor()
})
}
func newAwsEc2Processor() *AwsEc2Processor {
return &AwsEc2Processor{
MaxParallelCalls: DefaultMaxParallelCalls,
TagCacheSize: DefaultCacheSize,
Timeout: config.Duration(DefaultTimeout),
CacheTTL: config.Duration(DefaultCacheTTL),
}
}
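// getTagFromDescribeTags returns the value of the given tag key from the
// DescribeTags output, or an empty string if the key is not present.
func getTagFromDescribeTags(o *ec2.DescribeTagsOutput, tag string) string {
for _, t := range o.Tags {
// aws.ToString returns "" for nil pointers, avoiding a nil dereference.
if aws.ToString(t.Key) == tag {
return aws.ToString(t.Value)
}
}
return ""
}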


@ -0,0 +1,184 @@
package aws_ec2
import (
"sync"
"testing"
"time"
"github.com/coocood/freecache"
"github.com/stretchr/testify/require"
"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/config"
"github.com/influxdata/telegraf/metric"
"github.com/influxdata/telegraf/plugins/common/parallel"
"github.com/influxdata/telegraf/testutil"
)
func TestBasicStartup(t *testing.T) {
p := newAwsEc2Processor()
p.Log = &testutil.Logger{}
p.ImdsTags = []string{"accountId", "instanceId"}
acc := &testutil.Accumulator{}
require.NoError(t, p.Init())
require.Empty(t, acc.GetTelegrafMetrics())
require.Empty(t, acc.Errors)
}
func TestBasicStartupWithEC2Tags(t *testing.T) {
p := newAwsEc2Processor()
p.Log = &testutil.Logger{}
p.ImdsTags = []string{"accountId", "instanceId"}
p.EC2Tags = []string{"Name"}
acc := &testutil.Accumulator{}
require.NoError(t, p.Init())
require.Empty(t, acc.GetTelegrafMetrics())
require.Empty(t, acc.Errors)
}
func TestBasicStartupWithCacheTTL(t *testing.T) {
p := newAwsEc2Processor()
p.Log = &testutil.Logger{}
p.ImdsTags = []string{"accountId", "instanceId"}
p.CacheTTL = config.Duration(12 * time.Hour)
acc := &testutil.Accumulator{}
require.NoError(t, p.Init())
require.Empty(t, acc.GetTelegrafMetrics())
require.Empty(t, acc.Errors)
}
func TestBasicStartupWithTagCacheSize(t *testing.T) {
p := newAwsEc2Processor()
p.Log = &testutil.Logger{}
p.ImdsTags = []string{"accountId", "instanceId"}
p.TagCacheSize = 100
acc := &testutil.Accumulator{}
require.NoError(t, p.Init())
require.Empty(t, acc.GetTelegrafMetrics())
require.Empty(t, acc.Errors)
}
func TestBasicInitNoTagsReturnAnError(t *testing.T) {
p := newAwsEc2Processor()
p.Log = &testutil.Logger{}
err := p.Init()
require.Error(t, err)
}
func TestBasicInitInvalidTagsReturnAnError(t *testing.T) {
p := newAwsEc2Processor()
p.Log = &testutil.Logger{}
p.ImdsTags = []string{"dummy", "qwerty"}
err := p.Init()
require.Error(t, err)
}
func TestTracking(t *testing.T) {
// Setup raw input and expected output
inputRaw := []telegraf.Metric{
metric.New(
"m1",
map[string]string{
"metric_tag": "from_metric",
},
map[string]interface{}{"value": int64(1)},
time.Unix(0, 0),
),
metric.New(
"m2",
map[string]string{
"metric_tag": "foo_metric",
},
map[string]interface{}{"value": int64(2)},
time.Unix(0, 0),
),
}
expected := []telegraf.Metric{
metric.New(
"m1",
map[string]string{
"metric_tag": "from_metric",
"accountId": "123456789",
"instanceId": "i-123456789123",
},
map[string]interface{}{"value": int64(1)},
time.Unix(0, 0),
),
metric.New(
"m2",
map[string]string{
"metric_tag": "foo_metric",
"accountId": "123456789",
"instanceId": "i-123456789123",
},
map[string]interface{}{"value": int64(2)},
time.Unix(0, 0),
),
}
// Create fake notification for testing
var mu sync.Mutex
delivered := make([]telegraf.DeliveryInfo, 0, len(inputRaw))
notify := func(di telegraf.DeliveryInfo) {
mu.Lock()
defer mu.Unlock()
delivered = append(delivered, di)
}
// Convert raw input to tracking metric
input := make([]telegraf.Metric, 0, len(inputRaw))
for _, m := range inputRaw {
tm, _ := metric.WithTracking(m, notify)
input = append(input, tm)
}
// Prepare and start the plugin
plugin := &AwsEc2Processor{
MaxParallelCalls: DefaultMaxParallelCalls,
TagCacheSize: DefaultCacheSize,
Timeout: config.Duration(DefaultTimeout),
CacheTTL: config.Duration(DefaultCacheTTL),
ImdsTags: []string{"accountId", "instanceId"},
Log: &testutil.Logger{},
}
require.NoError(t, plugin.Init())
// Instead of starting the plugin which tries to connect to the remote
// service, we just fill the cache and start the minimum mechanics to
// process the metrics.
plugin.tagCache = freecache.NewCache(DefaultCacheSize)
require.NoError(t, plugin.tagCache.Set([]byte("accountId"), []byte("123456789"), -1))
require.NoError(t, plugin.tagCache.Set([]byte("instanceId"), []byte("i-123456789123"), -1))
var acc testutil.Accumulator
plugin.parallel = parallel.NewOrdered(&acc, plugin.asyncAdd, DefaultMaxOrderedQueueSize, plugin.MaxParallelCalls)
// Schedule the metrics and wait until they are ready to perform the
// comparison
for _, in := range input {
require.NoError(t, plugin.Add(in, &acc))
}
require.Eventually(t, func() bool {
return int(acc.NMetrics()) >= len(expected)
}, 3*time.Second, 100*time.Millisecond)
actual := acc.GetTelegrafMetrics()
testutil.RequireMetricsEqual(t, expected, actual)
// Simulate output acknowledging delivery
for _, m := range actual {
m.Accept()
}
// Check delivery
require.Eventuallyf(t, func() bool {
mu.Lock()
defer mu.Unlock()
return len(input) == len(delivered)
}, time.Second, 100*time.Millisecond, "%d delivered but %d expected", len(delivered), len(expected))
}


@ -0,0 +1,78 @@
# Attach AWS EC2 metadata to metrics
[[processors.aws_ec2]]
## Instance identity document tags to attach to metrics.
## For more information see:
## https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-identity-documents.html
##
## Available tags:
## * accountId
## * architecture
## * availabilityZone
## * billingProducts
## * imageId
## * instanceId
## * instanceType
## * kernelId
## * pendingTime
## * privateIp
## * ramdiskId
## * region
## * version
# imds_tags = []
## EC2 instance tags retrieved with DescribeTags action.
## If a tag is empty upon retrieval, it is omitted when tagging metrics.
## Note that for this to work, the role attached to the EC2 instance or the AWS
## credentials available from the environment must have a policy attached that
## allows ec2:DescribeTags.
##
## For more information see:
## https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeTags.html
# ec2_tags = []
## Paths to instance metadata information to attach to the metrics.
## Specify the full path without the base-path e.g. `tags/instance/Name`.
##
## For more information see:
## https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html
# metadata_paths = []
## Convert metadata tag names to canonical names representing the full path,
## with slashes ('/') replaced by underscores. By default, only the last path
## element is used to name the tag.
# canonical_metadata_tags = false
## Timeout for HTTP requests made against the AWS EC2 metadata endpoint.
# timeout = "10s"
## ordered controls whether or not the metrics need to stay in the same order
## this plugin received them in. If false, this plugin will change the order
## with requests hitting cached results moving through immediately and not
## waiting on slower lookups. This may cause issues for you if you are
## depending on the order of metrics staying the same. If so, set this to true.
## Keeping the metrics ordered may be slightly slower.
# ordered = false
## max_parallel_calls is the maximum number of AWS API calls to be in flight
## at the same time.
## It's probably best to keep this number fairly low.
# max_parallel_calls = 10
## cache_ttl determines how long each cached item will remain in the cache before
## it is removed and subsequently needs to be queried for from the AWS API. With
## the default of "0s", cached items do not expire.
# cache_ttl = "0s"
## tag_cache_size determines how many of the values which are found in imds_tags
## or ec2_tags will be kept in memory for faster lookup on successive processing
## of metrics. You may want to adjust this if you have excessively large numbers
## of tags on your EC2 instances, and you are using the ec2_tags field. This
## typically does not need to be changed when using the imds_tags field.
# tag_cache_size = 1000
## log_cache_stats will emit a log line periodically to stdout with details of
## cache entries, hits, misses, and evacuations since the last time stats were
## emitted. This can be helpful in determining whether caching is being effective
## in your environment. Stats are emitted every 30 seconds. By default, this
## setting is disabled.
# log_cache_stats = false