# Azure Data Explorer Output Plugin

This plugin writes metrics to the [Azure Data Explorer][data_explorer],
[Azure Synapse Data Explorer][synapse], and
[Real-Time Analytics in Fabric][fabric] services.

Azure Data Explorer is a distributed, columnar store, purpose-built for any
type of logs, metrics and time-series data.

⭐ Telegraf v1.20.0
🏷️ cloud, datastore
💻 all

[data_explorer]: https://docs.microsoft.com/en-us/azure/data-explorer
[synapse]: https://docs.microsoft.com/en-us/azure/synapse-analytics/data-explorer/data-explorer-overview
[fabric]: https://learn.microsoft.com/en-us/fabric/real-time-analytics/overview

## Pre-requisites

- [Create an Azure Data Explorer cluster and
  database](https://docs.microsoft.com/en-us/azure/data-explorer/create-cluster-database-portal)
- A VM/compute or container to host Telegraf - it can be hosted locally where
  the app/service to be monitored is deployed, or remotely on a dedicated
  monitoring compute/container.

## Global configuration options <!-- @/docs/includes/plugin_config.md -->

In addition to the plugin-specific configuration settings, plugins support
additional global and plugin configuration settings. These settings are used to
modify metrics, tags, and field or create aliases and configure ordering, etc.
See the [CONFIGURATION.md][CONFIGURATION.md] for more details.

[CONFIGURATION.md]: ../../../docs/CONFIGURATION.md#plugins

## Configuration

```toml @sample.conf
# Sends metrics to Azure Data Explorer
[[outputs.azure_data_explorer]]
  ## The URI property of the Azure Data Explorer resource on Azure
  ## ex: endpoint_url = https://myadxresource.australiasoutheast.kusto.windows.net
  endpoint_url = ""

  ## The Azure Data Explorer database that the metrics will be ingested into.
  ## The plugin will NOT generate this database automatically, it's expected that this database already exists before ingestion.
  ## ex: "exampledatabase"
  database = ""

  ## Timeout for Azure Data Explorer operations
  # timeout = "20s"

  ## Type of metrics grouping used when pushing to Azure Data Explorer.
  ## Default is "TablePerMetric" for one table per different metric.
  ## For more information, please check the plugin README.
  # metrics_grouping_type = "TablePerMetric"

  ## Name of the single table to store all the metrics (Only needed if metrics_grouping_type is "SingleTable").
  # table_name = ""

  ## Creates tables and relevant mapping if set to true (default).
  ## Skips table and mapping creation if set to false, this is useful for running Telegraf with the lowest possible permissions i.e. the table ingestor role.
  # create_tables = true

  ## Ingestion method to use.
  ## Available options are
  ##   - managed  --  streaming ingestion with fallback to batched ingestion or the "queued" method below
  ##   - queued   --  queue up metrics data and process sequentially
  # ingestion_type = "queued"
```

## Metrics Grouping

Metrics can be grouped in two ways before being sent to Azure Data Explorer. To
specify which grouping type the plugin should use, set the respective value for
`metrics_grouping_type` in the config file. If no value is given, the metrics
will be grouped using `TablePerMetric` by default.

### TablePerMetric

The plugin will group the metrics by the metric name, and will send each group
of metrics to an Azure Data Explorer table. If the table doesn't exist, the
plugin will create the table; if the table exists, the plugin will try to
merge the Telegraf metric schema into the existing table. For more information
about the merge process, check the [`.create-merge` documentation][create-merge].

The table name will match the `name` property of the metric; this means that
the name of the metric should comply with the Azure Data Explorer table naming
constraints, in case you plan to add a prefix to the metric name.

[create-merge]: https://docs.microsoft.com/en-us/azure/data-explorer/kusto/management/create-merge-table-command

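For example, a Telegraf metric named `cpu` would land in a table called `cpu`.
A minimal sketch of the command the plugin would issue in that case
(hypothetical table name; the schema is described in the Tables Schema section
below):

```kql
// Hypothetical example: the command issued for a metric named "cpu"
.create-merge table ['cpu'] (['fields']:dynamic, ['name']:string, ['tags']:dynamic, ['timestamp']:datetime)
```
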
### SingleTable

The plugin will send all the metrics received to a single Azure Data Explorer
table. The name of the table must be supplied via `table_name` in the config
file. If the table doesn't exist, the plugin will create the table; if the
table exists, the plugin will try to merge the Telegraf metric schema into the
existing table. For more information about the merge process, check the
[`.create-merge` documentation][create-merge].

## Tables Schema

The schema of the Azure Data Explorer table will match the structure of the
Telegraf `Metric` object. The corresponding Azure Data Explorer command
generated by the plugin would be like the following:

```text
.create-merge table ['table-name'] (['fields']:dynamic, ['name']:string, ['tags']:dynamic, ['timestamp']:datetime)
```

The corresponding table mapping would be like the following:

```text
.create-or-alter table ['table-name'] ingestion json mapping 'table-name_mapping' '[{"column":"fields", "Properties":{"Path":"$[\'fields\']"}},{"column":"name", "Properties":{"Path":"$[\'name\']"}},{"column":"tags", "Properties":{"Path":"$[\'tags\']"}},{"column":"timestamp", "Properties":{"Path":"$[\'timestamp\']"}}]'
```

**Note**: This plugin will automatically create Azure Data Explorer tables and
the corresponding table mapping as per the above-mentioned commands.

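To verify what the plugin created, you can inspect the JSON ingestion mapping
of the table; a minimal sketch (assuming a table literally named `table-name`):

```kql
// Inspect the plugin-created JSON ingestion mapping (hypothetical table name)
.show table ['table-name'] ingestion json mappings
```
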
## Ingestion type

**Note**:
[Streaming ingestion](https://aka.ms/AAhlg6s)
has to be enabled on the ADX cluster when the `managed` option is used.
Refer to the query below to check whether streaming ingestion is enabled:

```kql
.show database <DB-Name> policy streamingingestion
```

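If streaming ingestion turns out to be disabled, it can be switched on at the
database level; a minimal sketch, assuming you hold the required admin
permissions (`<DB-Name>` is the same placeholder as above):

```kql
// Enable streaming ingestion on the database (requires admin permissions)
.alter database <DB-Name> policy streamingingestion enable
```
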
## Authentication

### Supported Authentication Methods

This plugin provides several types of authentication. The plugin will check the
existence of several specific environment variables, and consequently will
choose the appropriate method.

These methods are:

1. AAD Application Tokens (Service Principals with secrets or certificates).

   For guidance on how to create and register an App in Azure Active Directory,
   check [this article][register], and for more information on Service
   Principals, check [this article][principal].

2. AAD User Tokens

   - Allows Telegraf to authenticate like a user. This method is mainly used
     for development purposes only.

3. Managed Service Identity (MSI) token

   - If you are running Telegraf from an Azure VM or Azure infrastructure, then
     this is the preferred authentication method.

[register]: https://docs.microsoft.com/en-us/azure/active-directory/develop/quickstart-register-app#register-an-application
[principal]: https://docs.microsoft.com/en-us/azure/active-directory/develop/app-objects-and-service-principals

Whichever method is used, the designated principal needs to be assigned the
`Database User` role on the database level in Azure Data Explorer. This role
will allow the plugin to create the required tables and ingest data into them.
If `create_tables=false`, the designated principal only needs the `Database
Ingestor` role.

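As an illustration, these role assignments can be granted with Kusto management
commands; a minimal sketch for a service principal, where `exampledatabase`,
`<app-client-id>` and `<tenant-id>` are hypothetical placeholders:

```kql
// Grant the 'Database User' role (required when create_tables=true)
.add database exampledatabase users ('aadapp=<app-client-id>;<tenant-id>') 'Telegraf service principal'

// Or grant only the narrower 'Database Ingestor' role (sufficient when create_tables=false)
.add database exampledatabase ingestors ('aadapp=<app-client-id>;<tenant-id>') 'Telegraf service principal'
```
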
### Configurations of the chosen Authentication Method

The plugin will authenticate using the first available of the following
configurations. **It's important to understand that the assessment, and
consequently the choice of authentication method, will happen in the order
listed below**:

1. **Client Credentials**: Azure AD Application ID and Secret.

   Set the following environment variables:

   - `AZURE_TENANT_ID`: Specifies the Tenant to which to authenticate.
   - `AZURE_CLIENT_ID`: Specifies the app client ID to use.
   - `AZURE_CLIENT_SECRET`: Specifies the app secret to use.

2. **Client Certificate**: Azure AD Application ID and X.509 Certificate.

   - `AZURE_TENANT_ID`: Specifies the Tenant to which to authenticate.
   - `AZURE_CLIENT_ID`: Specifies the app client ID to use.
   - `AZURE_CERTIFICATE_PATH`: Specifies the certificate path to use.
   - `AZURE_CERTIFICATE_PASSWORD`: Specifies the certificate password to use.

3. **Resource Owner Password**: Azure AD User and Password. This grant type is
   *not recommended*; use device login instead if you need interactive login.

   - `AZURE_TENANT_ID`: Specifies the Tenant to which to authenticate.
   - `AZURE_CLIENT_ID`: Specifies the app client ID to use.
   - `AZURE_USERNAME`: Specifies the username to use.
   - `AZURE_PASSWORD`: Specifies the password to use.

4. **Azure Managed Service Identity**: Delegate credential management to the
   platform. Requires that code is running in Azure, e.g. on a VM. All
   configuration is handled by Azure. See [Azure Managed Service Identity][msi]
   for more details. Only available when using the [Azure Resource
   Manager][arm].

[msi]: https://docs.microsoft.com/en-us/azure/active-directory/msi-overview
[arm]: https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-group-overview

## Querying data collected in Azure Data Explorer

Examples of data transformations and queries that would be useful to gain
insights -

### Using SQL input plugin

Sample SQL metrics data -

name | tags | timestamp | fields
-----|------|-----------|-------
sqlserver_database_io|{"database_name":"azure-sql-db2","file_type":"DATA","host":"adx-vm","logical_filename":"tempdev","measurement_db_type":"AzureSQLDB","physical_filename":"tempdb.mdf","replica_updateability":"READ_WRITE","sql_instance":"adx-sql-server"}|2021-09-09T13:51:20Z|{"current_size_mb":16,"database_id":2,"file_id":1,"read_bytes":2965504,"read_latency_ms":68,"reads":47,"rg_read_stall_ms":42,"rg_write_stall_ms":0,"space_used_mb":0,"write_bytes":1220608,"write_latency_ms":103,"writes":149}
sqlserver_waitstats|{"database_name":"azure-sql-db2","host":"adx-vm","measurement_db_type":"AzureSQLDB","replica_updateability":"READ_WRITE","sql_instance":"adx-sql-server","wait_category":"Worker Thread","wait_type":"THREADPOOL"}|2021-09-09T13:51:20Z|{"max_wait_time_ms":15,"resource_wait_ms":4469,"signal_wait_time_ms":0,"wait_time_ms":4469,"waiting_tasks_count":1464}

Since the collected metrics object is of complex type, "fields" and "tags" are
stored as dynamic data type. There are multiple ways to query this data:

1. Query JSON attributes directly: Azure Data Explorer provides the ability to
   query JSON data in raw format without parsing it, so JSON attributes can be
   queried directly in the following way:

   ```text
   Tablename
   | where name == "sqlserver_azure_db_resource_stats" and todouble(fields.avg_cpu_percent) > 7
   ```

   ```text
   Tablename
   | distinct tostring(tags.database_name)
   ```

   **Note** - This approach could have a performance impact in case of large
   volumes of data; use the approach mentioned below for such cases.

1. **Use [Update
   policy](https://docs.microsoft.com/en-us/azure/data-explorer/kusto/management/updatepolicy)**:
   Transform dynamic data type columns using an update policy. This is the
   recommended performant way of querying over large volumes of data compared
   to querying directly over JSON attributes:

   ```kql
   // Function to transform data
   .create-or-alter function Transform_TargetTableName() {
     SourceTableName
     | mv-apply fields on (extend key = tostring(bag_keys(fields)[0]))
     | project fieldname=key, value=todouble(fields[key]), name, tags, timestamp
   }

   // Create a destination table with the above query's result schema (if it doesn't exist already)
   .set-or-append TargetTableName <| Transform_TargetTableName() | limit 0

   // Apply the update policy on the destination table
   .alter table TargetTableName policy update
   @'[{"IsEnabled": true, "Source": "SourceTableName", "Query": "Transform_TargetTableName()", "IsTransactional": true, "PropagateIngestionProperties": false}]'
   ```

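Once the update policy is in place, queries run against the flattened
destination table instead of the raw JSON columns; a minimal hypothetical
sketch using the `fieldname`/`value` columns produced by
`Transform_TargetTableName()` above:

```kql
// Average read latency per 5-minute window, from the flattened table
TargetTableName
| where name == "sqlserver_database_io" and fieldname == "read_latency_ms"
| summarize avg(value) by bin(timestamp, 5m)
```
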
### Using syslog input plugin

Sample syslog data -

name | tags | timestamp | fields
-----|------|-----------|-------
syslog|{"appname":"azsecmond","facility":"user","host":"adx-linux-vm","hostname":"adx-linux-vm","severity":"info"}|2021-09-20T14:36:44Z|{"facility_code":1,"message":" 2021/09/20 14:36:44.890110 Failed to connect to mdsd: dial unix /var/run/mdsd/default_djson.socket: connect: no such file or directory","procid":"2184","severity_code":6,"timestamp":"1632148604890477000","version":1}
syslog|{"appname":"CRON","facility":"authpriv","host":"adx-linux-vm","hostname":"adx-linux-vm","severity":"info"}|2021-09-20T14:37:01Z|{"facility_code":10,"message":" pam_unix(cron:session): session opened for user root by (uid=0)","procid":"26446","severity_code":6,"timestamp":"1632148621120781000","version":1}

There are multiple ways to flatten dynamic columns, using the 'extend' or
'bag_unpack' operator. You can use either of these in the above-mentioned
update policy function - 'Transform_TargetTableName()':

- Use the
  [extend](https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/extendoperator)
  operator - This is the recommended approach compared to 'bag_unpack' as it is
  faster and more robust. Even if the schema changes, it will not break queries
  or dashboards.

  ```text
  Tablename
  | extend facility_code=toint(fields.facility_code), message=tostring(fields.message), procid=tolong(fields.procid), severity_code=toint(fields.severity_code),
  SysLogTimestamp=unixtime_nanoseconds_todatetime(tolong(fields.timestamp)), version=todouble(fields.version),
  appname=tostring(tags.appname), facility=tostring(tags.facility), host=tostring(tags.host), hostname=tostring(tags.hostname), severity=tostring(tags.severity)
  | project-away fields, tags
  ```

- Use the [bag_unpack
  plugin](https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/bag-unpackplugin)
  to unpack the dynamic type columns automatically. This method could lead to
  issues if the source schema changes, as it dynamically expands columns.

  ```text
  Tablename
  | evaluate bag_unpack(tags, columnsConflict='replace_source')
  | evaluate bag_unpack(fields, columnsConflict='replace_source')
  ```