Adding upstream version 1.34.4.
Signed-off-by: Daniel Baumann <daniel@debian.org>
parent e393c3af3f
commit 4978089aab
4963 changed files with 677545 additions and 0 deletions
206  plugins/outputs/azure_monitor/README.md  Normal file

@@ -0,0 +1,206 @@
# Azure Monitor Output Plugin

This plugin writes metrics to [Azure Monitor][azure_monitor] which has
a metric resolution of one minute. To accommodate this in Telegraf, the
plugin automatically aggregates metrics into one-minute buckets and sends
them to the service on every flush interval.

> [!IMPORTANT]
> The Azure Monitor custom metrics service is currently in preview and might
> not be available in all Azure regions.
> Please also take the [metric time limitations](#metric-time-limitations) into
> account!

The metrics from each input plugin will be written to a separate Azure Monitor
namespace, prefixed with `Telegraf/` by default. The field name for each metric
is written as the Azure Monitor metric name. All field values are written as a
summarized set that includes: min, max, sum, count. Tags are written as a
dimension on each Azure Monitor metric.
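
As an illustration with made-up numbers: two `cpu` samples carrying a
`usage_user` field and gathered within the same minute are collapsed into a
single aggregate before being written:

```text
# gathered within the same one-minute bucket
cpu,host=a usage_user=10.0
cpu,host=a usage_user=30.0

# written as one Azure Monitor custom metric
namespace:  Telegraf/cpu
metric:     usage_user
dimensions: host=a
values:     min=10.0 max=30.0 sum=40.0 count=2
```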

⭐ Telegraf v1.8.0
🏷️ cloud, datastore
💻 all

[azure_monitor]: https://learn.microsoft.com/en-us/azure/azure-monitor

## Global configuration options <!-- @/docs/includes/plugin_config.md -->

In addition to the plugin-specific configuration settings, plugins support
additional global and plugin configuration settings. These settings are used to
modify metrics, tags, and fields, create aliases, and configure ordering, etc.
See the [CONFIGURATION.md][CONFIGURATION.md] for more details.

[CONFIGURATION.md]: ../../../docs/CONFIGURATION.md#plugins

## Configuration

```toml @sample.conf
# Send aggregate metrics to Azure Monitor
[[outputs.azure_monitor]]
  ## Timeout for HTTP writes.
  # timeout = "20s"

  ## Set the namespace prefix, defaults to "Telegraf/<input-name>".
  # namespace_prefix = "Telegraf/"

  ## Azure Monitor doesn't have a string value type, so convert string
  ## fields to dimensions (a.k.a. tags) if enabled. Azure Monitor allows
  ## a maximum of 10 dimensions so Telegraf will only send the first 10
  ## alphanumeric dimensions.
  # strings_as_dimensions = false

  ## Both region and resource_id must be set or be available via the
  ## Instance Metadata service on Azure Virtual Machines.
  #
  ## Azure Region to publish metrics against.
  ## ex: region = "southcentralus"
  # region = ""
  #
  ## The Azure Resource ID against which metrics will be logged, e.g.
  ## resource_id = "/subscriptions/<subscription_id>/resourceGroups/<resource_group>/providers/Microsoft.Compute/virtualMachines/<vm_name>"
  # resource_id = ""

  ## Optionally, if in Azure US Government, China, or other sovereign
  ## cloud environment, set the appropriate REST endpoint for receiving
  ## metrics. (Note: region may be unused in this context)
  # endpoint_url = "https://monitoring.core.usgovcloudapi.net"

  ## Time limits for the metrics to send.
  ## Documentation can be found here:
  ## https://learn.microsoft.com/en-us/azure/azure-monitor/essentials/metrics-store-custom-rest-api?tabs=rest#timestamp
  ## However, the returned (400) error message might document more strict or
  ## relaxed settings. By default, only past metrics within the limit are sent.
  # timestamp_limit_past = "30m"
  # timestamp_limit_future = "-1m"
```

## Setup

1. [Register the `microsoft.insights` resource provider in your Azure
   subscription][resource provider].
1. If using Managed Service Identities to authenticate an Azure VM, [enable
   system-assigned managed identity][enable msi].
1. Use a region that supports Azure Monitor Custom Metrics. For regions with
   Custom Metrics support, an endpoint will be available with the format
   `https://<region>.monitoring.azure.com` (see the example below).

[resource provider]: https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-manager-supported-services

[enable msi]: https://docs.microsoft.com/en-us/azure/active-directory/managed-service-identity/qs-configure-portal-windows-vm
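
For example (placeholder names), with region `eastus` and the resource ID of a
virtual machine, the plugin derives the write endpoint like this:

```text
https://eastus.monitoring.azure.com/subscriptions/<subscription_id>/resourceGroups/<resource_group>/providers/Microsoft.Compute/virtualMachines/<vm_name>/metrics
```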

### Region and Resource ID

The plugin will attempt to discover the region and resource ID using the Azure
VM Instance Metadata service. If Telegraf is not running on a virtual machine or
the VM Instance Metadata service is not available, the following variables are
required for the output to function.

* region
* resource_id
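
A minimal sketch of a static configuration, using placeholder subscription,
resource group, and VM names:

```toml
[[outputs.azure_monitor]]
  region = "eastus"
  resource_id = "/subscriptions/<subscription_id>/resourceGroups/<resource_group>/providers/Microsoft.Compute/virtualMachines/<vm_name>"
```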

### Authentication

This plugin uses one of several different authentication methods. The
preferred authentication methods are different from the *order* in which each
authentication is checked. Here are the preferred authentication methods:

1. Managed Service Identity (MSI) token: This is the preferred authentication
   method. Telegraf will automatically authenticate using this method when
   running on Azure VMs.
2. AAD Application Tokens (Service Principals)

   * Primarily useful if Telegraf is writing metrics for other resources.
     [More information][principal].
   * A Service Principal or User Principal needs to be assigned the `Monitoring
     Metrics Publisher` role on the resource(s) metrics will be emitted
     against.

3. AAD User Tokens (User Principals)

   * Allows Telegraf to authenticate like a user. It is best to use this method
     for development.

[principal]: https://docs.microsoft.com/en-us/azure/active-directory/develop/active-directory-application-objects

The plugin will authenticate using the first available of the following
configurations (see the environment example after this list):

1. **Client Credentials**: Azure AD Application ID and Secret. Set the following
   environment variables:

   * `AZURE_TENANT_ID`: Specifies the Tenant to which to authenticate.
   * `AZURE_CLIENT_ID`: Specifies the app client ID to use.
   * `AZURE_CLIENT_SECRET`: Specifies the app secret to use.

1. **Client Certificate**: Azure AD Application ID and X.509 Certificate.

   * `AZURE_TENANT_ID`: Specifies the Tenant to which to authenticate.
   * `AZURE_CLIENT_ID`: Specifies the app client ID to use.
   * `AZURE_CERTIFICATE_PATH`: Specifies the certificate path to use.
   * `AZURE_CERTIFICATE_PASSWORD`: Specifies the certificate password to use.

1. **Resource Owner Password**: Azure AD User and Password. This grant type is
   *not recommended*; use device login instead if you need interactive login.

   * `AZURE_TENANT_ID`: Specifies the Tenant to which to authenticate.
   * `AZURE_CLIENT_ID`: Specifies the app client ID to use.
   * `AZURE_USERNAME`: Specifies the username to use.
   * `AZURE_PASSWORD`: Specifies the password to use.

1. **Azure Managed Service Identity**: Delegate credential management to the
   platform. Requires that code is running in Azure, e.g. on a VM. All
   configuration is handled by Azure. See [Azure Managed Service Identity][msi]
   for more details. Only available when using the [Azure Resource
   Manager][arm].

[msi]: https://docs.microsoft.com/en-us/azure/active-directory/msi-overview
[arm]: https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-group-overview
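
For instance, to use the Client Credentials configuration from the list above,
the service principal's details could be exported before starting Telegraf
(all values are placeholders):

```sh
export AZURE_TENANT_ID="<tenant_id>"
export AZURE_CLIENT_ID="<app_client_id>"
export AZURE_CLIENT_SECRET="<app_secret>"
```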

> [!NOTE]
> As shown above, the last option (#4) is the preferred way to authenticate
> when running Telegraf on Azure VMs.

## Dimensions

Azure Monitor only accepts values with a numeric type. The plugin will drop
fields with a string type by default. The plugin can set all string type fields
as extra dimensions in the Azure Monitor custom metric by setting the
configuration option `strings_as_dimensions` to `true`.

Keep in mind, Azure Monitor allows a maximum of 10 dimensions per metric. The
plugin will deterministically drop any dimensions that exceed the 10
dimension limit.

To convert only a subset of string-typed fields to dimensions, enable
`strings_as_dimensions` and use the [`fieldinclude` or `fieldexclude`
modifiers][conf-modifiers] to limit the string-typed fields that are sent to
the plugin.

[conf-modifiers]: ../../../docs/CONFIGURATION.md#modifiers
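
As a sketch, assuming metrics with a numeric `temperature` field plus string
fields `status` and `message` (hypothetical names), the following would convert
only `status` into a dimension by filtering out the other string field before
it reaches the plugin:

```toml
[[outputs.azure_monitor]]
  strings_as_dimensions = true
  ## Pass only these fields to this output; the string field "message" is
  ## dropped upstream and therefore never converted to a dimension.
  fieldinclude = ["temperature", "status"]
```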

## Metric time limitations

Azure Monitor won't accept metrics too far in the past or future. Keep this in
mind when configuring your output buffer limits or other variables, such as
flush intervals, or when using input sources that could cause metrics to be
out of this allowed range.

According to the [documentation][timestamp_docs], the timestamp should not be
older than 20 minutes or more than 5 minutes in the future at the time when the
metric is sent to the Azure Monitor service. However, HTTP `400` error messages
returned by the service might specify other values such as 30 minutes in the
past and 4 minutes in the future.

You can control the timeframe actually sent using the `timestamp_limit_past` and
`timestamp_limit_future` settings. By default only metrics between 30 minutes
and up to one minute in the past are sent. The lower limit represents the more
permissive limit received in the `400` error messages. The upper limit leaves
enough time for aggregation to happen by not sending aggregations too early.

> [!IMPORTANT]
> When adapting the limits you need to take the limits permitted by the service
> as well as the latency of sending metrics into account. Furthermore, you
> should not send metrics too early, as in this case aggregation might not
> happen and values would be misleading.

[timestamp_docs]: https://learn.microsoft.com/en-us/azure/azure-monitor/essentials/metrics-store-custom-rest-api?tabs=rest#timestamp
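
For example, to enforce the stricter bounds from the documentation instead of
the defaults (a sketch; adjust to what your region's `400` responses actually
report):

```toml
[[outputs.azure_monitor]]
  ## Documented service bounds: at most 20 minutes in the past ...
  timestamp_limit_past = "20m"
  ## ... and up to 5 minutes in the future. Positive future limits risk
  ## sending aggregates before their one-minute bucket is complete.
  timestamp_limit_future = "5m"
```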

597  plugins/outputs/azure_monitor/azure_monitor.go  Normal file

@@ -0,0 +1,597 @@
//go:generate ../../../tools/readme_config_includer/generator
package azure_monitor

import (
	"bytes"
	"compress/gzip"
	"context"
	_ "embed"
	"encoding/binary"
	"encoding/json"
	"errors"
	"fmt"
	"hash/fnv"
	"io"
	"net/http"
	"regexp"
	"strings"
	"time"

	"github.com/Azure/go-autorest/autorest"
	"github.com/Azure/go-autorest/autorest/azure/auth"

	"github.com/influxdata/telegraf"
	"github.com/influxdata/telegraf/config"
	"github.com/influxdata/telegraf/internal"
	"github.com/influxdata/telegraf/metric"
	"github.com/influxdata/telegraf/plugins/outputs"
	"github.com/influxdata/telegraf/selfstat"
)

//go:embed sample.conf
var sampleConfig string

const (
	vmInstanceMetadataURL      = "http://169.254.169.254/metadata/instance?api-version=2017-12-01"
	resourceIDTemplate         = "/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Compute/virtualMachines/%s"
	resourceIDScaleSetTemplate = "/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Compute/virtualMachineScaleSets/%s"
	maxRequestBodySize         = 4000000
)

var invalidNameCharRE = regexp.MustCompile(`[^a-zA-Z0-9_]`)

type dimension struct {
	name  string
	value string
}

type aggregate struct {
	name       string
	min        float64
	max        float64
	sum        float64
	count      int64
	dimensions []dimension
	updated    bool
}

type AzureMonitor struct {
	Timeout              config.Duration `toml:"timeout"`
	NamespacePrefix      string          `toml:"namespace_prefix"`
	StringsAsDimensions  bool            `toml:"strings_as_dimensions"`
	Region               string          `toml:"region"`
	ResourceID           string          `toml:"resource_id"`
	EndpointURL          string          `toml:"endpoint_url"`
	TimestampLimitPast   config.Duration `toml:"timestamp_limit_past"`
	TimestampLimitFuture config.Duration `toml:"timestamp_limit_future"`
	Log                  telegraf.Logger `toml:"-"`

	url      string
	preparer autorest.Preparer
	client   *http.Client

	cache    map[time.Time]map[uint64]*aggregate
	timeFunc func() time.Time

	MetricOutsideWindow selfstat.Stat
}

func (*AzureMonitor) SampleConfig() string {
	return sampleConfig
}

func (a *AzureMonitor) Init() error {
	a.cache = make(map[time.Time]map[uint64]*aggregate, 36)

	authorizer, err := auth.NewAuthorizerFromEnvironmentWithResource("https://monitoring.azure.com/")
	if err != nil {
		return fmt.Errorf("creating authorizer failed: %w", err)
	}
	a.preparer = autorest.CreatePreparer(authorizer.WithAuthorization())

	return nil
}

func (a *AzureMonitor) Connect() error {
	a.client = &http.Client{
		Transport: &http.Transport{
			Proxy: http.ProxyFromEnvironment,
		},
		Timeout: time.Duration(a.Timeout),
	}

	// If information is missing try to retrieve it from the Azure VM instance
	if a.Region == "" || a.ResourceID == "" {
		region, resourceID, err := vmInstanceMetadata(a.client)
		if err != nil {
			return fmt.Errorf("getting VM metadata failed: %w", err)
		}

		if a.Region == "" {
			a.Region = region
		}

		if a.ResourceID == "" {
			a.ResourceID = resourceID
		}
	}

	if a.ResourceID == "" {
		return errors.New("no resource ID configured or available via VM instance metadata")
	}

	if a.EndpointURL == "" {
		if a.Region == "" {
			return errors.New("no region configured or available via VM instance metadata")
		}
		a.url = fmt.Sprintf("https://%s.monitoring.azure.com%s/metrics", a.Region, a.ResourceID)
	} else {
		a.url = a.EndpointURL + a.ResourceID + "/metrics"
	}
	a.Log.Debugf("Writing to Azure Monitor URL: %s", a.url)

	a.MetricOutsideWindow = selfstat.Register(
		"azure_monitor",
		"metric_outside_window",
		map[string]string{
			"region":      a.Region,
			"resource_id": a.ResourceID,
		},
	)

	a.Reset()

	return nil
}

// Close shuts down any active connections
func (a *AzureMonitor) Close() error {
	a.client.CloseIdleConnections()
	a.client = nil
	return nil
}

// Add will append a metric to the output aggregate
func (a *AzureMonitor) Add(m telegraf.Metric) {
	// Azure Monitor only supports aggregates 30 minutes into the past and 4
	// minutes into the future. Future metrics are dropped when pushed.
	tbucket := m.Time().Truncate(time.Minute)
	if tbucket.Before(a.timeFunc().Add(-time.Duration(a.TimestampLimitPast))) {
		a.MetricOutsideWindow.Incr(1)
		return
	}

	// Azure Monitor doesn't have a string value type, so convert string fields
	// to dimensions (a.k.a. tags) if enabled.
	if a.StringsAsDimensions {
		for _, f := range m.FieldList() {
			if v, ok := f.Value.(string); ok {
				m.AddTag(f.Key, v)
			}
		}
	}

	for _, f := range m.FieldList() {
		fv, err := internal.ToFloat64(f.Value)
		if err != nil {
			continue
		}

		// Azure Monitor does not support fields so the field name is appended
		// to the metric name.
		sanitizeKey := invalidNameCharRE.ReplaceAllString(f.Key, "_")
		name := m.Name() + "-" + sanitizeKey
		id := hashIDWithField(m.HashID(), f.Key)

		// Create the time bucket if it doesn't exist
		if _, ok := a.cache[tbucket]; !ok {
			a.cache[tbucket] = make(map[uint64]*aggregate)
		}

		// Fetch existing aggregate
		agg, ok := a.cache[tbucket][id]
		if !ok {
			dimensions := make([]dimension, 0, len(m.TagList()))
			for _, tag := range m.TagList() {
				dimensions = append(dimensions, dimension{
					name:  tag.Key,
					value: tag.Value,
				})
			}
			a.cache[tbucket][id] = &aggregate{
				name:       name,
				dimensions: dimensions,
				min:        fv,
				max:        fv,
				sum:        fv,
				count:      1,
				updated:    true,
			}
			continue
		}

		if fv < agg.min {
			agg.min = fv
		}
		if fv > agg.max {
			agg.max = fv
		}
		agg.sum += fv
		agg.count++
		agg.updated = true
	}
}

// Push sends metrics to the output metric buffer
func (a *AzureMonitor) Push() []telegraf.Metric {
	var metrics []telegraf.Metric
	for tbucket, aggs := range a.cache {
		// Do not send metrics early
		if tbucket.After(a.timeFunc().Add(time.Duration(a.TimestampLimitFuture))) {
			continue
		}
		for _, agg := range aggs {
			// Only send aggregates that have had an update since the last push.
			if !agg.updated {
				continue
			}

			tags := make(map[string]string, len(agg.dimensions))
			for _, tag := range agg.dimensions {
				tags[tag.name] = tag.value
			}

			m := metric.New(agg.name,
				tags,
				map[string]interface{}{
					"min":   agg.min,
					"max":   agg.max,
					"sum":   agg.sum,
					"count": agg.count,
				},
				tbucket,
			)

			metrics = append(metrics, m)
		}
	}
	return metrics
}

// Reset clears the cache of aggregate metrics
func (a *AzureMonitor) Reset() {
	for tbucket := range a.cache {
		// Remove aggregates older than 30 minutes
		if tbucket.Before(a.timeFunc().Add(-time.Duration(a.TimestampLimitPast))) {
			delete(a.cache, tbucket)
			continue
		}
		// Metrics updated within the latest 1m have not been pushed and should
		// not be cleared.
		if tbucket.After(a.timeFunc().Add(time.Duration(a.TimestampLimitFuture))) {
			continue
		}
		for id := range a.cache[tbucket] {
			a.cache[tbucket][id].updated = false
		}
	}
}

// Write writes metrics to the remote endpoint
func (a *AzureMonitor) Write(metrics []telegraf.Metric) error {
	now := a.timeFunc()
	tsEarliest := now.Add(-time.Duration(a.TimestampLimitPast))
	tsLatest := now.Add(time.Duration(a.TimestampLimitFuture))

	writeErr := &internal.PartialWriteError{
		MetricsAccept: make([]int, 0, len(metrics)),
	}
	azmetrics := make(map[uint64]*azureMonitorMetric, len(metrics))
	for i, m := range metrics {
		// Skip metrics that are outside of the valid timespan
		if m.Time().Before(tsEarliest) || m.Time().After(tsLatest) {
			a.Log.Tracef("Metric outside acceptable time window: %v", m)
			a.MetricOutsideWindow.Incr(1)
			writeErr.Err = errors.New("metric(s) outside of acceptable time window")
			writeErr.MetricsReject = append(writeErr.MetricsReject, i)
			continue
		}

		amm, err := translate(m, a.NamespacePrefix)
		if err != nil {
			a.Log.Errorf("Could not create azure metric for %q; discarding point", m.Name())
			if writeErr.Err == nil {
				writeErr.Err = errors.New("translating metric(s) failed")
			}
			writeErr.MetricsReject = append(writeErr.MetricsReject, i)
			continue
		}

		id := hashIDWithTagKeysOnly(m)
		if azm, ok := azmetrics[id]; !ok {
			azmetrics[id] = amm
			azmetrics[id].index = i
		} else {
			azmetrics[id].Data.BaseData.Series = append(
				azm.Data.BaseData.Series,
				amm.Data.BaseData.Series...,
			)
			azmetrics[id].index = i
		}
	}

	if len(azmetrics) == 0 {
		if writeErr.Err == nil {
			return nil
		}
		return writeErr
	}

	var buffer bytes.Buffer
	buffer.Grow(maxRequestBodySize)
	batchIndices := make([]int, 0, len(azmetrics))
	for _, m := range azmetrics {
		// Azure Monitor accepts new batches of points in new-line delimited
		// JSON, following RFC 4288 (see https://github.com/ndjson/ndjson-spec).
		buf, err := json.Marshal(m)
		if err != nil {
			writeErr.MetricsReject = append(writeErr.MetricsReject, m.index)
			writeErr.Err = err
			continue
		}
		batchIndices = append(batchIndices, m.index)

		// Azure Monitor has a maximum request body size of 4MB. Send batches
		// that exceed this size via separate write requests.
		if buffer.Len()+len(buf)+1 > maxRequestBodySize {
			if retryable, err := a.send(buffer.Bytes()); err != nil {
				writeErr.Err = err
				if !retryable {
					writeErr.MetricsReject = append(writeErr.MetricsReject, batchIndices...)
				}
				return writeErr
			}
			writeErr.MetricsAccept = append(writeErr.MetricsAccept, batchIndices...)
			batchIndices = make([]int, 0, len(azmetrics))
			buffer.Reset()
		}
		if _, err := buffer.Write(buf); err != nil {
			return fmt.Errorf("writing to buffer failed: %w", err)
		}
		if err := buffer.WriteByte('\n'); err != nil {
			return fmt.Errorf("writing to buffer failed: %w", err)
		}
	}

	if retryable, err := a.send(buffer.Bytes()); err != nil {
		writeErr.Err = err
		if !retryable {
			writeErr.MetricsReject = append(writeErr.MetricsReject, batchIndices...)
		}
		return writeErr
	}
	writeErr.MetricsAccept = append(writeErr.MetricsAccept, batchIndices...)

	if writeErr.Err == nil {
		return nil
	}

	return writeErr
}

func (a *AzureMonitor) send(body []byte) (bool, error) {
	var buf bytes.Buffer
	g := gzip.NewWriter(&buf)
	if _, err := g.Write(body); err != nil {
		return false, fmt.Errorf("zipping content failed: %w", err)
	}
	if err := g.Close(); err != nil {
		return false, fmt.Errorf("closing gzip writer failed: %w", err)
	}

	req, err := http.NewRequest("POST", a.url, &buf)
	if err != nil {
		return false, fmt.Errorf("creating request failed: %w", err)
	}

	req.Header.Set("Content-Encoding", "gzip")
	req.Header.Set("Content-Type", "application/x-ndjson")

	// Add the authorization header. WithAuthorization will automatically
	// refresh the token if needed.
	req, err = a.preparer.Prepare(req)
	if err != nil {
		return false, fmt.Errorf("unable to fetch authentication credentials: %w", err)
	}

	resp, err := a.client.Do(req)
	if err != nil {
		if errors.Is(err, context.DeadlineExceeded) {
			a.client.CloseIdleConnections()
			a.client = &http.Client{
				Transport: &http.Transport{
					Proxy: http.ProxyFromEnvironment,
				},
				Timeout: time.Duration(a.Timeout),
			}
		}
		return true, err
	}
	defer resp.Body.Close()

	if resp.StatusCode >= 200 && resp.StatusCode <= 299 {
		return false, nil
	}

	retryable := resp.StatusCode != 400
	if respbody, err := io.ReadAll(resp.Body); err == nil {
		return retryable, fmt.Errorf("failed to write batch: [%d] %s: %s", resp.StatusCode, resp.Status, string(respbody))
	}

	return retryable, fmt.Errorf("failed to write batch: [%d] %s", resp.StatusCode, resp.Status)
}

// vmInstanceMetadata retrieves metadata about the current Azure VM
func vmInstanceMetadata(c *http.Client) (region, resourceID string, err error) {
	req, err := http.NewRequest("GET", vmInstanceMetadataURL, nil)
	if err != nil {
		return "", "", fmt.Errorf("error creating request: %w", err)
	}
	req.Header.Set("Metadata", "true")

	resp, err := c.Do(req)
	if err != nil {
		return "", "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", "", err
	}
	if resp.StatusCode >= 300 || resp.StatusCode < 200 {
		return "", "", fmt.Errorf("unable to fetch instance metadata: [%s] %d",
			vmInstanceMetadataURL, resp.StatusCode)
	}

	var metadata virtualMachineMetadata
	if err := json.Unmarshal(body, &metadata); err != nil {
		return "", "", err
	}

	region = metadata.Compute.Location
	resourceID = metadata.ResourceID()

	return region, resourceID, nil
}

func hashIDWithField(id uint64, fk string) uint64 {
	h := fnv.New64a()
	b := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(b, id)
	h.Write(b[:n])
	h.Write([]byte("\n"))
	h.Write([]byte(fk))
	h.Write([]byte("\n"))
	return h.Sum64()
}

func hashIDWithTagKeysOnly(m telegraf.Metric) uint64 {
	h := fnv.New64a()
	h.Write([]byte(m.Name()))
	h.Write([]byte("\n"))
	for _, tag := range m.TagList() {
		if tag.Key == "" || tag.Value == "" {
			continue
		}

		h.Write([]byte(tag.Key))
		h.Write([]byte("\n"))
	}
	b := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(b, uint64(m.Time().UnixNano()))
	h.Write(b[:n])
	h.Write([]byte("\n"))
	return h.Sum64()
}

func translate(m telegraf.Metric, prefix string) (*azureMonitorMetric, error) {
	dimensionNames := make([]string, 0, len(m.TagList()))
	dimensionValues := make([]string, 0, len(m.TagList()))
	for _, tag := range m.TagList() {
		// Azure custom metrics service supports up to 10 dimensions
		if len(dimensionNames) >= 10 {
			continue
		}

		if tag.Key == "" || tag.Value == "" {
			continue
		}

		dimensionNames = append(dimensionNames, tag.Key)
		dimensionValues = append(dimensionValues, tag.Value)
	}

	vmin, err := getFloatField(m, "min")
	if err != nil {
		return nil, err
	}
	vmax, err := getFloatField(m, "max")
	if err != nil {
		return nil, err
	}
	vsum, err := getFloatField(m, "sum")
	if err != nil {
		return nil, err
	}
	vcount, err := getIntField(m, "count")
	if err != nil {
		return nil, err
	}

	mn, ns := "Missing", "Missing"
	names := strings.SplitN(m.Name(), "-", 2)
	if len(names) > 1 {
		mn = names[1]
	}
	if len(names) > 0 {
		ns = names[0]
	}
	ns = prefix + ns

	return &azureMonitorMetric{
		Time: m.Time(),
		Data: &azureMonitorData{
			BaseData: &azureMonitorBaseData{
				Metric:         mn,
				Namespace:      ns,
				DimensionNames: dimensionNames,
				Series: []*azureMonitorSeries{
					{
						DimensionValues: dimensionValues,
						Min:             vmin,
						Max:             vmax,
						Sum:             vsum,
						Count:           vcount,
					},
				},
			},
		},
	}, nil
}

func getFloatField(m telegraf.Metric, key string) (float64, error) {
	fv, ok := m.GetField(key)
	if !ok {
		return 0, fmt.Errorf("missing field: %s", key)
	}

	if value, ok := fv.(float64); ok {
		return value, nil
	}
	return 0, fmt.Errorf("unexpected type: %s: %T", key, fv)
}

func getIntField(m telegraf.Metric, key string) (int64, error) {
	fv, ok := m.GetField(key)
	if !ok {
		return 0, fmt.Errorf("missing field: %s", key)
	}

	if value, ok := fv.(int64); ok {
		return value, nil
	}
	return 0, fmt.Errorf("unexpected type: %s: %T", key, fv)
}

func init() {
	outputs.Add("azure_monitor", func() telegraf.Output {
		return &AzureMonitor{
			NamespacePrefix:      "Telegraf/",
			TimestampLimitPast:   config.Duration(20 * time.Minute),
			TimestampLimitFuture: config.Duration(-1 * time.Minute),
			Timeout:              config.Duration(5 * time.Second),
			timeFunc:             time.Now,
		}
	})
}

619  plugins/outputs/azure_monitor/azure_monitor_test.go  Normal file

@@ -0,0 +1,619 @@
package azure_monitor

import (
	"bufio"
	"compress/gzip"
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"testing"
	"time"

	"github.com/Azure/go-autorest/autorest"
	"github.com/Azure/go-autorest/autorest/adal"
	"github.com/stretchr/testify/require"

	"github.com/influxdata/telegraf"
	"github.com/influxdata/telegraf/config"
	"github.com/influxdata/telegraf/metric"
	"github.com/influxdata/telegraf/testutil"
)

func TestAggregate(t *testing.T) {
	tests := []struct {
		name                  string
		stringdim             bool
		metrics               []telegraf.Metric
		addTime               time.Time
		pushTime              time.Time
		expected              []telegraf.Metric
		expectedOutsideWindow int64
	}{
		{
			name: "add metric outside window is dropped",
			metrics: []telegraf.Metric{
				testutil.MustMetric(
					"cpu",
					map[string]string{},
					map[string]interface{}{
						"value": 42,
					},
					time.Unix(0, 0),
				),
			},
			addTime:               time.Unix(3600, 0),
			pushTime:              time.Unix(3600, 0),
			expectedOutsideWindow: 1,
		},
		{
			name: "metric not sent until period expires",
			metrics: []telegraf.Metric{
				testutil.MustMetric(
					"cpu",
					map[string]string{},
					map[string]interface{}{
						"value": 42,
					},
					time.Unix(0, 0),
				),
			},
			addTime:  time.Unix(0, 0),
			pushTime: time.Unix(0, 0),
		},
		{
			name:      "add strings as dimensions",
			stringdim: true,
			metrics: []telegraf.Metric{
				testutil.MustMetric(
					"cpu",
					map[string]string{
						"host": "localhost",
					},
					map[string]interface{}{
						"value":   42,
						"message": "howdy",
					},
					time.Unix(0, 0),
				),
			},
			addTime:  time.Unix(0, 0),
			pushTime: time.Unix(3600, 0),
			expected: []telegraf.Metric{
				testutil.MustMetric(
					"cpu-value",
					map[string]string{
						"host":    "localhost",
						"message": "howdy",
					},
					map[string]interface{}{
						"min":   42.0,
						"max":   42.0,
						"sum":   42.0,
						"count": 1,
					},
					time.Unix(0, 0),
				),
			},
		},
		{
			name: "add metric to cache and push",
			metrics: []telegraf.Metric{
				testutil.MustMetric(
					"cpu",
					map[string]string{},
					map[string]interface{}{
						"value": 42,
					},
					time.Unix(0, 0),
				),
			},
			addTime:  time.Unix(0, 0),
			pushTime: time.Unix(3600, 0),
			expected: []telegraf.Metric{
				testutil.MustMetric(
					"cpu-value",
					map[string]string{},
					map[string]interface{}{
						"min":   42.0,
						"max":   42.0,
						"sum":   42.0,
						"count": 1,
					},
					time.Unix(0, 0),
				),
			},
		},
		{
			name: "added metrics are aggregated",
			metrics: []telegraf.Metric{
				testutil.MustMetric(
					"cpu",
					map[string]string{},
					map[string]interface{}{
						"value": 42,
					},
					time.Unix(0, 0),
				),
				testutil.MustMetric(
					"cpu",
					map[string]string{},
					map[string]interface{}{
						"value": 84,
					},
					time.Unix(0, 0),
				),
				testutil.MustMetric(
					"cpu",
					map[string]string{},
					map[string]interface{}{
						"value": 2,
					},
					time.Unix(0, 0),
				),
			},
			addTime:  time.Unix(0, 0),
			pushTime: time.Unix(3600, 0),
			expected: []telegraf.Metric{
				testutil.MustMetric(
					"cpu-value",
					map[string]string{},
					map[string]interface{}{
						"min":   2.0,
						"max":   84.0,
						"sum":   128.0,
						"count": 3,
					},
					time.Unix(0, 0),
				),
			},
		},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			msiEndpoint, err := adal.GetMSIVMEndpoint()
			require.NoError(t, err)
			t.Setenv("MSI_ENDPOINT", msiEndpoint)

			// Setup plugin
			plugin := &AzureMonitor{
				Region:               "test",
				ResourceID:           "/test",
				StringsAsDimensions:  tt.stringdim,
				TimestampLimitPast:   config.Duration(30 * time.Minute),
				TimestampLimitFuture: config.Duration(-1 * time.Minute),
				Log:                  testutil.Logger{},
				timeFunc:             func() time.Time { return tt.addTime },
			}
			require.NoError(t, plugin.Init())
			require.NoError(t, plugin.Connect())
			defer plugin.Close()

			// Reset statistics
			plugin.MetricOutsideWindow.Set(0)

			// Add the data
			for _, m := range tt.metrics {
				plugin.Add(m)
			}

			// Push out the data at a later time
			plugin.timeFunc = func() time.Time { return tt.pushTime }
			metrics := plugin.Push()
			plugin.Reset()

			// Check the results
			require.Equal(t, tt.expectedOutsideWindow, plugin.MetricOutsideWindow.Get())
			testutil.RequireMetricsEqual(t, tt.expected, metrics)
		})
	}
}

func TestWrite(t *testing.T) {
	// Set up a fake environment for Authorizer. This used to fake an MSI
	// environment, but since https://github.com/Azure/go-autorest/pull/670/files
	// that is no longer possible, so we fake a user/password authentication.
	t.Setenv("AZURE_CLIENT_ID", "fake")
	t.Setenv("AZURE_USERNAME", "fake")
	t.Setenv("AZURE_PASSWORD", "fake")

	tests := []struct {
		name            string
		metrics         []telegraf.Metric
		expectedCalls   uint64
		expectedMetrics uint64
		errmsg          string
	}{
		{
			name: "if not an azure metric nothing is sent",
			metrics: []telegraf.Metric{
				testutil.MustMetric(
					"cpu",
					map[string]string{},
					map[string]interface{}{
						"value": 42,
					},
					time.Unix(0, 0),
				),
			},
			errmsg: "translating metric(s) failed",
		},
		{
			name: "single azure metric",
			metrics: []telegraf.Metric{
				testutil.MustMetric(
					"cpu-value",
					map[string]string{},
					map[string]interface{}{
						"min":   float64(42),
						"max":   float64(42),
						"sum":   float64(42),
						"count": int64(1),
					},
					time.Unix(0, 0),
				),
			},
			expectedCalls:   1,
			expectedMetrics: 1,
		},
		{
			name: "multiple azure metrics",
			metrics: []telegraf.Metric{
				testutil.MustMetric(
					"cpu-value",
					map[string]string{},
					map[string]interface{}{
						"min":   float64(42),
						"max":   float64(42),
						"sum":   float64(42),
						"count": int64(1),
					},
					time.Unix(0, 0),
				),
				testutil.MustMetric(
					"cpu-value",
					map[string]string{},
					map[string]interface{}{
						"min":   float64(42),
						"max":   float64(42),
						"sum":   float64(42),
						"count": int64(1),
					},
					time.Unix(60, 0),
				),
			},
			expectedCalls:   1,
			expectedMetrics: 2,
		},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			// Setup test server to collect the sent metrics
			var calls atomic.Uint64
			var metrics atomic.Uint64
			ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
				calls.Add(1)

				gz, err := gzip.NewReader(r.Body)
				if err != nil {
					w.WriteHeader(http.StatusInternalServerError)
					t.Logf("cannot create gzip reader: %v", err)
					t.Fail()
					return
				}

				scanner := bufio.NewScanner(gz)
				for scanner.Scan() {
					var m azureMonitorMetric
					if err := json.Unmarshal(scanner.Bytes(), &m); err != nil {
						w.WriteHeader(http.StatusInternalServerError)
						t.Logf("cannot unmarshal JSON: %v", err)
						t.Fail()
						return
					}
					metrics.Add(1)
				}
				w.WriteHeader(http.StatusOK)
			}))
			defer ts.Close()

			// Setup the plugin
			plugin := AzureMonitor{
				EndpointURL:          "http://" + ts.Listener.Addr().String(),
				Region:               "test",
				ResourceID:           "/test",
				TimestampLimitPast:   config.Duration(30 * time.Minute),
				TimestampLimitFuture: config.Duration(-1 * time.Minute),
				Log:                  testutil.Logger{},
				timeFunc:             func() time.Time { return time.Unix(120, 0) },
			}
			require.NoError(t, plugin.Init())

			// Override with testing setup
			plugin.preparer = autorest.CreatePreparer(autorest.NullAuthorizer{}.WithAuthorization())
			require.NoError(t, plugin.Connect())
			defer plugin.Close()

			err := plugin.Write(tt.metrics)
			if tt.errmsg != "" {
				require.ErrorContains(t, err, tt.errmsg)
				return
			}
			require.NoError(t, err)
			require.Equal(t, tt.expectedCalls, calls.Load())
			require.Equal(t, tt.expectedMetrics, metrics.Load())
		})
	}
}

func TestWriteTimelimits(t *testing.T) {
	// Set up a fake environment for Authorizer. This used to fake an MSI
	// environment, but since https://github.com/Azure/go-autorest/pull/670/files
	// that is no longer possible, so we fake a user/password authentication.
	t.Setenv("AZURE_CLIENT_ID", "fake")
	t.Setenv("AZURE_USERNAME", "fake")
	t.Setenv("AZURE_PASSWORD", "fake")

	// Setup input metrics
	tref := time.Now().Truncate(time.Minute)
	inputs := []telegraf.Metric{
		metric.New(
			"cpu-value",
			map[string]string{
				"status": "too old",
			},
			map[string]interface{}{
				"min":   float64(42),
				"max":   float64(42),
				"sum":   float64(42),
				"count": int64(1),
			},
			tref.Add(-time.Hour),
		),
		metric.New(
			"cpu-value",
			map[string]string{
				"status": "30 min in the past",
			},
			map[string]interface{}{
				"min":   float64(42),
				"max":   float64(42),
				"sum":   float64(42),
				"count": int64(1),
			},
			tref.Add(-30*time.Minute),
		),
		metric.New(
			"cpu-value",
			map[string]string{
				"status": "20 min in the past",
			},
			map[string]interface{}{
				"min":   float64(42),
				"max":   float64(42),
				"sum":   float64(42),
				"count": int64(1),
			},
			tref.Add(-20*time.Minute),
		),
		metric.New(
			"cpu-value",
			map[string]string{
				"status": "10 min in the past",
			},
			map[string]interface{}{
				"min":   float64(42),
				"max":   float64(42),
				"sum":   float64(42),
				"count": int64(1),
			},
			tref.Add(-10*time.Minute),
		),
		metric.New(
			"cpu-value",
			map[string]string{
				"status": "now",
			},
			map[string]interface{}{
				"min":   float64(42),
				"max":   float64(42),
				"sum":   float64(42),
				"count": int64(1),
			},
			tref,
		),
		metric.New(
			"cpu-value",
			map[string]string{
				"status": "1 min in the future",
			},
			map[string]interface{}{
				"min":   float64(42),
				"max":   float64(42),
				"sum":   float64(42),
				"count": int64(1),
			},
			tref.Add(1*time.Minute),
		),
		metric.New(
			"cpu-value",
			map[string]string{
				"status": "2 min in the future",
			},
			map[string]interface{}{
				"min":   float64(42),
				"max":   float64(42),
				"sum":   float64(42),
				"count": int64(1),
			},
			tref.Add(2*time.Minute),
		),
		metric.New(
			"cpu-value",
			map[string]string{
				"status": "4 min in the future",
			},
			map[string]interface{}{
				"min":   float64(42),
				"max":   float64(42),
				"sum":   float64(42),
				"count": int64(1),
			},
			tref.Add(4*time.Minute),
		),
		metric.New(
			"cpu-value",
			map[string]string{
				"status": "5 min in the future",
			},
			map[string]interface{}{
				"min":   float64(42),
				"max":   float64(42),
				"sum":   float64(42),
				"count": int64(1),
			},
			tref.Add(5*time.Minute),
		),
		metric.New(
			"cpu-value",
			map[string]string{
				"status": "too far in the future",
			},
			map[string]interface{}{
				"min":   float64(42),
				"max":   float64(42),
				"sum":   float64(42),
				"count": int64(1),
			},
			tref.Add(time.Hour),
		),
	}

	// Error message returned for status 400
	msg := `{"error":{"code":"BadRequest","message":"'time' should not be older than 30 minutes and not more than 4 minutes in the future\r\n"}}`

	tests := []struct {
		name          string
		input         []telegraf.Metric
		limitPast     time.Duration
		limitFuture   time.Duration
		expectedCount int
		expectedError string
	}{
		{
			name:          "only good metrics",
			input:         inputs[1 : len(inputs)-2],
			limitPast:     48 * time.Hour,
			limitFuture:   48 * time.Hour,
			expectedCount: len(inputs) - 3,
		},
		{
			name:          "metrics out of bounds",
			input:         inputs,
			limitPast:     48 * time.Hour,
			limitFuture:   48 * time.Hour,
			expectedCount: len(inputs),
			expectedError: "400 Bad Request: " + msg,
		},
		{
			name:          "default limit",
			input:         inputs,
			limitPast:     20 * time.Minute,
			limitFuture:   -1 * time.Minute,
			expectedCount: 2,
			expectedError: "metric(s) outside of acceptable time window",
		},
		{
			name:          "permissive limit",
			input:         inputs,
			limitPast:     30 * time.Minute,
			limitFuture:   5 * time.Minute,
			expectedCount: len(inputs) - 2,
			expectedError: "metric(s) outside of acceptable time window",
		},
		{
			name:          "very strict",
			input:         inputs,
			limitPast:     19*time.Minute + 59*time.Second,
			limitFuture:   3*time.Minute + 59*time.Second,
			expectedCount: len(inputs) - 6,
			expectedError: "metric(s) outside of acceptable time window",
		},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			// Counter for the number of received metrics
			var count atomic.Int32

			// Setup test server
			ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
				defer r.Body.Close()

				reader, err := gzip.NewReader(r.Body)
				if err != nil {
					w.WriteHeader(http.StatusInternalServerError)
					t.Logf("unzipping content failed: %v", err)
					t.Fail()
					return
				}
				defer reader.Close()

				status := http.StatusOK
				scanner := bufio.NewScanner(reader)
				for scanner.Scan() {
					var data map[string]interface{}
					if err := json.Unmarshal(scanner.Bytes(), &data); err != nil {
						w.WriteHeader(http.StatusInternalServerError)
						t.Logf("decoding JSON failed: %v", err)
						t.Fail()
						return
					}

					timestamp, err := time.Parse(time.RFC3339, data["time"].(string))
					if err != nil {
						w.WriteHeader(http.StatusInternalServerError)
						t.Logf("decoding time failed: %v", err)
						t.Fail()
						return
					}
					if timestamp.Before(tref.Add(-30*time.Minute)) || timestamp.After(tref.Add(5*time.Minute)) {
						status = http.StatusBadRequest
					}
					count.Add(1)
				}
				w.WriteHeader(status)
				if status == http.StatusBadRequest {
					//nolint:errcheck // Ignoring returned error as it is not relevant for the test
					w.Write([]byte(msg))
				}
			}))
			defer ts.Close()

			// Setup plugin
			plugin := AzureMonitor{
				EndpointURL:          "http://" + ts.Listener.Addr().String(),
				Region:               "test",
				ResourceID:           "/test",
				TimestampLimitPast:   config.Duration(tt.limitPast),
				TimestampLimitFuture: config.Duration(tt.limitFuture),
				Log:                  testutil.Logger{},
				timeFunc:             func() time.Time { return tref },
			}
			require.NoError(t, plugin.Init())

			// Override with testing setup
			plugin.preparer = autorest.CreatePreparer(autorest.NullAuthorizer{}.WithAuthorization())
			require.NoError(t, plugin.Connect())
			defer plugin.Close()

			// Test writing
			err := plugin.Write(tt.input)
			if tt.expectedError == "" {
				require.NoError(t, err)
			} else {
				require.ErrorContains(t, err, tt.expectedError)
			}
			require.Equal(t, tt.expectedCount, int(count.Load()))
		})
	}
}

37  plugins/outputs/azure_monitor/sample.conf  Normal file

@@ -0,0 +1,37 @@
# Send aggregate metrics to Azure Monitor
[[outputs.azure_monitor]]
  ## Timeout for HTTP writes.
  # timeout = "20s"

  ## Set the namespace prefix, defaults to "Telegraf/<input-name>".
  # namespace_prefix = "Telegraf/"

  ## Azure Monitor doesn't have a string value type, so convert string
  ## fields to dimensions (a.k.a. tags) if enabled. Azure Monitor allows
  ## a maximum of 10 dimensions so Telegraf will only send the first 10
  ## alphanumeric dimensions.
  # strings_as_dimensions = false

  ## Both region and resource_id must be set or be available via the
  ## Instance Metadata service on Azure Virtual Machines.
  #
  ## Azure Region to publish metrics against.
  ## ex: region = "southcentralus"
  # region = ""
  #
  ## The Azure Resource ID against which metrics will be logged, e.g.
  ## resource_id = "/subscriptions/<subscription_id>/resourceGroups/<resource_group>/providers/Microsoft.Compute/virtualMachines/<vm_name>"
  # resource_id = ""

  ## Optionally, if in Azure US Government, China, or other sovereign
  ## cloud environment, set the appropriate REST endpoint for receiving
  ## metrics. (Note: region may be unused in this context)
  # endpoint_url = "https://monitoring.core.usgovcloudapi.net"

  ## Time limits for the metrics to send.
  ## Documentation can be found here:
  ## https://learn.microsoft.com/en-us/azure/azure-monitor/essentials/metrics-store-custom-rest-api?tabs=rest#timestamp
  ## However, the returned (400) error message might document more strict or
  ## relaxed settings. By default, only past metrics within the limit are sent.
  # timestamp_limit_past = "30m"
  # timestamp_limit_future = "-1m"

60  plugins/outputs/azure_monitor/types.go  Normal file

@@ -0,0 +1,60 @@
package azure_monitor

import (
	"fmt"
	"time"
)

type azureMonitorMetric struct {
	Time  time.Time         `json:"time"`
	Data  *azureMonitorData `json:"data"`
	index int
}

type azureMonitorData struct {
	BaseData *azureMonitorBaseData `json:"baseData"`
}

type azureMonitorBaseData struct {
	Metric         string                `json:"metric"`
	Namespace      string                `json:"namespace"`
	DimensionNames []string              `json:"dimNames"`
	Series         []*azureMonitorSeries `json:"series"`
}

type azureMonitorSeries struct {
	DimensionValues []string `json:"dimValues"`
	Min             float64  `json:"min"`
	Max             float64  `json:"max"`
	Sum             float64  `json:"sum"`
	Count           int64    `json:"count"`
}

// virtualMachineMetadata contains information about a VM from the metadata service
type virtualMachineMetadata struct {
	Compute struct {
		Location          string `json:"location"`
		Name              string `json:"name"`
		ResourceGroupName string `json:"resourceGroupName"`
		SubscriptionID    string `json:"subscriptionId"`
		VMScaleSetName    string `json:"vmScaleSetName"`
	} `json:"compute"`
}

func (m *virtualMachineMetadata) ResourceID() string {
	if m.Compute.VMScaleSetName != "" {
		return fmt.Sprintf(
			resourceIDScaleSetTemplate,
			m.Compute.SubscriptionID,
			m.Compute.ResourceGroupName,
			m.Compute.VMScaleSetName,
		)
	}

	return fmt.Sprintf(
		resourceIDTemplate,
		m.Compute.SubscriptionID,
		m.Compute.ResourceGroupName,
		m.Compute.Name,
	)
}