Adding upstream version 1.34.4.

Signed-off-by: Daniel Baumann <daniel@debian.org>
2025-05-24 07:26:29 +02:00 · 2025-05-24 07:26:29 +02:00 · 4978089aab
commit 4978089aab
parent e393c3af3f
4963 changed files with 677545 additions and 0 deletions
--- a/plugins/parsers/grok/README.md
+++ b/plugins/parsers/grok/README.md
@ -0,0 +1,272 @@
+# Grok Parser Plugin
+
+The grok data format parses line-delimited data using a
+regular-expression-like language.
+
+The best way to get acquainted with grok patterns is to read the logstash docs,
+which are available [here][1].
+
+The grok parser uses a slightly modified version of logstash "grok"
+patterns, with the format:
+
+```text
+%{<capture_syntax>[:<semantic_name>][:<modifier>]}
+```
+
+The `capture_syntax` defines the grok pattern that's used to parse the input
+line and the `semantic_name` is used to name the field or tag.  The extension
+`modifier` controls the data type that the parsed item is converted to or
+other special handling.
+
+By default all named captures are converted into string fields.
+If a pattern does not have a semantic name it will not be captured.
+Timestamp modifiers can be used to convert captures to the timestamp of the
+parsed metric.  If no timestamp is parsed the metric will be created using the
+current time.
+
+You must capture at least one field per line.
+
+- Available modifiers:
+  - string   (default if nothing is specified)
+  - int
+  - float
+  - duration (e.g., 5.23ms gets converted to int nanoseconds)
+  - tag      (converts the field into a tag)
+  - drop     (drops the field completely)
+  - measurement (use the matched text as the measurement name)
+- Timestamp modifiers:
+  - ts               (This will auto-learn the timestamp format)
+  - ts-ansic         ("Mon Jan _2 15:04:05 2006")
+  - ts-unix          ("Mon Jan _2 15:04:05 MST 2006")
+  - ts-ruby          ("Mon Jan 02 15:04:05 -0700 2006")
+  - ts-rfc822        ("02 Jan 06 15:04 MST")
+  - ts-rfc822z       ("02 Jan 06 15:04 -0700")
+  - ts-rfc850        ("Monday, 02-Jan-06 15:04:05 MST")
+  - ts-rfc1123       ("Mon, 02 Jan 2006 15:04:05 MST")
+  - ts-rfc1123z      ("Mon, 02 Jan 2006 15:04:05 -0700")
+  - ts-rfc3339       ("2006-01-02T15:04:05Z07:00")
+  - ts-rfc3339nano   ("2006-01-02T15:04:05.999999999Z07:00")
+  - ts-httpd         ("02/Jan/2006:15:04:05 -0700")
+  - ts-epoch         (seconds since unix epoch, may contain decimal)
+  - ts-epochnano     (nanoseconds since unix epoch)
+  - ts-epochmilli    (milliseconds since unix epoch)
+  - ts-syslog        ("Jan 02 15:04:05", parsed time is set to the current year)
+  - ts-"CUSTOM"
+
+CUSTOM time layouts must be within quotes and be the representation of the
+"reference time", which is `Mon Jan 2 15:04:05 -0700 MST 2006`.  To match a
+comma decimal point, use a period in the layout string.  For example,
+`%{TIMESTAMP:timestamp:ts-"2006-01-02 15:04:05.000"}` can be used to match
+`"2018-01-02 15:04:05,000"`.  See the [Golang time
+docs](https://golang.org/pkg/time/#Parse) for more details.
+
+Telegraf has many of its own [built-in patterns][] as well as support for most
+of the Logstash builtin patterns using [these Go compatible
+patterns][grok-patterns].
+
+**Note:** Golang regular expressions do not support lookahead or lookbehind.
+Logstash patterns that use these features may not be supported, or may use a Go
+friendly pattern that is not fully compatible with the Logstash pattern.
+
+[built-in patterns]: /plugins/parsers/grok/influx_patterns.go
+[grok-patterns]: https://github.com/vjeantet/grok/blob/master/patterns/grok-patterns
+
+If you need help building patterns to match your logs, you will find the [Grok
+Debug](https://grokdebug.herokuapp.com) application quite useful!
+
+[1]: https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
+
+## Configuration
+
+```toml
+[[inputs.file]]
+  ## Files to parse each interval.
+  ## These accept standard unix glob matching rules, but with the addition of
+  ## ** as a "super asterisk". ie:
+  ##   /var/log/**.log     -> recursively find all .log files in /var/log
+  ##   /var/log/*/*.log    -> find all .log files with a parent dir in /var/log
+  ##   /var/log/apache.log -> only tail the apache log file
+  files = ["/var/log/apache/access.log"]
+
+  ## The dataformat to be read from files
+  ## Each data format has its own unique set of configuration options, read
+  ## more about them here:
+  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
+  data_format = "grok"
+
+  ## This is a list of patterns to check the given log file(s) for.
+  ## Note that adding patterns here increases processing time. The most
+  ## efficient configuration is to have one pattern.
+  ## Other common built-in patterns are:
+  ##   %{COMMON_LOG_FORMAT}   (plain apache & nginx access logs)
+  ##   %{COMBINED_LOG_FORMAT} (access logs + referrer & agent)
+  grok_patterns = ["%{COMBINED_LOG_FORMAT}"]
+
+  ## Full path(s) to custom pattern files.
+  grok_custom_pattern_files = []
+
+  ## Custom patterns can also be defined here. Put one pattern per line.
+  grok_custom_patterns = '''
+  '''
+
+  ## Timezone allows you to provide an override for timestamps that
+  ## don't already include an offset
+  ## e.g. 04/06/2016 12:41:45 data one two 5.43µs
+  ##
+  ## Default: "" which renders UTC
+  ## Options are as follows:
+  ##   1. Local             -- interpret based on machine localtime
+  ##   2. "Canada/Eastern"  -- Unix TZ values like those found in https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
+  ##   3. UTC               -- or blank/unspecified, will return timestamp in UTC
+  grok_timezone = "Canada/Eastern"
+
+  ## When set to "disable", the timestamp will not be incremented if there is
+  ## a duplicate.
+  # grok_unique_timestamp = "auto"
+
+  ## Enable multiline messages to be processed.
+  # grok_multiline = false
+```
+
+### Timestamp Examples
+
+This example input and config parses a file using a custom timestamp conversion:
+
+```text
+2017-02-21 13:10:34 value=42
+```
+
+```toml
+[[inputs.file]]
+  grok_patterns = ['%{TIMESTAMP_ISO8601:timestamp:ts-"2006-01-02 15:04:05"} value=%{NUMBER:value:int}']
+```
+
+This example input and config parses a file using a timestamp in unix time:
+
+```text
+1466004605 value=42
+1466004605.123456789 value=42
+```
+
+```toml
+[[inputs.file]]
+  grok_patterns = ['%{NUMBER:timestamp:ts-epoch} value=%{NUMBER:value:int}']
+```
+
+This example parses a file using a built-in conversion and a custom pattern:
+
+```text
+Wed Apr 12 13:10:34 PST 2017 value=42
+```
+
+```toml
+[[inputs.file]]
+  grok_patterns = ["%{TS_UNIX:timestamp:ts-unix} value=%{NUMBER:value:int}"]
+  grok_custom_patterns = '''
+    TS_UNIX %{DAY} %{MONTH} %{MONTHDAY} %{HOUR}:%{MINUTE}:%{SECOND} %{TZ} %{YEAR}
+  '''
+```
+
+This example input and config parses a file using a custom timestamp conversion
+that doesn't match any specific standard:
+
+```text
+21/02/2017 13:10:34 value=42
+```
+
+```toml
+[[inputs.file]]
+  grok_patterns = ['%{MY_TIMESTAMP:timestamp:ts-"02/01/2006 15:04:05"} value=%{NUMBER:value:int}']
+
+  grok_custom_patterns = '''
+    MY_TIMESTAMP (?:\d{2}.\d{2}.\d{4} \d{2}:\d{2}:\d{2})
+  '''
+```
+
+For cases where the timestamp itself is without offset, the `grok_timezone`
+config option is available to denote an offset. By default (with
+`grok_timezone` either omitted, blank or set to `"UTC"`), the times are
+processed as if in the UTC timezone. If specified as
+`grok_timezone = "Local"`, the timestamp will be processed based on the
+current machine timezone configuration. Lastly, if using a timezone from the
+list of Unix
+[timezones](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones), grok
+will offset the timestamp accordingly.
+
+#### TOML Escaping
+
+When saving patterns to the configuration file, keep in mind the different TOML
+[string](https://github.com/toml-lang/toml#string) types and the escaping
+rules for each.  These escaping rules must be applied in addition to the
+escaping required by the grok syntax.  Using the multi-line literal string
+syntax with `'''` may be useful.
+
+The following config examples will parse this input file:
+
+```text
+|42|\uD83D\uDC2F|'telegraf'|
+```
+
+Since `|` is a special character in the grok language, we must escape it to
+get a literal `|`.  With a basic TOML string, special characters such as
+backslash must be escaped, requiring us to escape the backslash a second time.
+
+```toml
+[[inputs.file]]
+  grok_patterns = ["\\|%{NUMBER:value:int}\\|%{UNICODE_ESCAPE:escape}\\|'%{WORD:name}'\\|"]
+  grok_custom_patterns = "UNICODE_ESCAPE (?:\\\\u[0-9A-F]{4})+"
+```
+
+We cannot use a literal TOML string for the pattern, because we cannot match a
+`'` within it.  However, it works well for the custom pattern.
+
+```toml
+[[inputs.file]]
+  grok_patterns = ["\\|%{NUMBER:value:int}\\|%{UNICODE_ESCAPE:escape}\\|'%{WORD:name}'\\|"]
+  grok_custom_patterns = 'UNICODE_ESCAPE (?:\\u[0-9A-F]{4})+'
+```
+
+A multi-line literal string allows us to encode the pattern:
+
+```toml
+[[inputs.file]]
+  grok_patterns = ['''
+    \|%{NUMBER:value:int}\|%{UNICODE_ESCAPE:escape}\|'%{WORD:name}'\|
+  ''']
+  grok_custom_patterns = 'UNICODE_ESCAPE (?:\\u[0-9A-F]{4})+'
+```
+
+#### Tips for creating patterns
+
+Writing complex patterns can be difficult. Here is some advice for writing a
+new pattern or testing a pattern developed
+[online](https://grokdebug.herokuapp.com).
+
+Create a file output that writes to stdout, and disable other outputs while
+testing.  This will allow you to see the captured metrics.  Keep in mind that
+the file output will only print once per `flush_interval`.
+
+```toml
+[[outputs.file]]
+  files = ["stdout"]
+```
+
+- Start with a file containing only a single line of your input.
+- Remove all but the first token or piece of the line.
+- Add the section of your pattern to match this piece to your configuration file.
+- Verify that the metric is parsed successfully by running Telegraf.
+- If successful, add the next token, update the pattern and retest.
+- Continue one token at a time until the entire line is successfully parsed.
+
+#### Performance
+
+Performance depends heavily on the regular expressions that you use, but there
+are a few techniques that can help:
+
+- Avoid using patterns such as `%{DATA}` that will always match.
+- If possible, add `^` and `$` anchors to your pattern:
+
+  ```toml
+  [[inputs.file]]
+    grok_patterns = ["^%{COMBINED_LOG_FORMAT}$"]
+  ```
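The README's note on comma decimal points can be illustrated with a minimal sketch. It assumes only the standard library; `parseCommaDecimal` is a hypothetical helper name, not a Telegraf function. It normalizes the comma to a period before parsing, which is roughly what the parser's custom-layout branch does before handing the value to its timestamp helper.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// parseCommaDecimal is a hypothetical helper (not part of Telegraf).
// It replaces a comma decimal separator with a period and then parses
// the value with a period-based Go reference layout.
func parseCommaDecimal(layout, value string) (time.Time, error) {
	normalized := strings.ReplaceAll(value, ",", ".")
	return time.Parse(layout, normalized)
}

func main() {
	// The layout uses a period even though the input uses a comma.
	ts, err := parseCommaDecimal("2006-01-02 15:04:05.000", "2018-01-02 15:04:05,123")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(ts.Format(time.RFC3339Nano))
}
```

With no zone in the layout, `time.Parse` interprets the value as UTC, which matches the documented default when `grok_timezone` is left blank.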
--- a/plugins/parsers/grok/influx_patterns.go
+++ b/plugins/parsers/grok/influx_patterns.go
@ -0,0 +1,43 @@
+package grok
+
+//nolint:lll // conditionally long lines allowed
+const DefaultPatterns = `
+# Example log file pattern, example log looks like this:
+#   [04/Jun/2016:12:41:45 +0100] 1.25 200 192.168.1.1 5.432µs
+# Breakdown of the DURATION pattern below:
+#   NUMBER  is a builtin logstash grok pattern matching float & int numbers.
+#   [nuµm]? is a regex specifying 0 or 1 of the characters within brackets.
+#   s       is also regex, this pattern must end in "s".
+# so DURATION will match something like '5.324ms' or '6.1µs' or '10s'
+DURATION %{NUMBER}[nuµm]?s
+RESPONSE_CODE %{NUMBER:response_code:tag}
+RESPONSE_TIME %{DURATION:response_time_ns:duration}
+EXAMPLE_LOG \[%{HTTPDATE:ts:ts-httpd}\] %{NUMBER:myfloat:float} %{RESPONSE_CODE} %{IPORHOST:clientip} %{RESPONSE_TIME}
+
+# Wider-ranging username matching vs. logstash built-in %{USER}
+NGUSERNAME [a-zA-Z0-9\.\@\-\+_%]+
+NGUSER %{NGUSERNAME}
+# Wider-ranging client IP matching
+CLIENT (?:%{IPV6}|%{IPV4}|%{HOSTNAME}|%{HOSTPORT})
+
+##
+## COMMON LOG PATTERNS
+##
+
+# apache & nginx logs, this is also known as the "common log format"
+#   see https://en.wikipedia.org/wiki/Common_Log_Format
+COMMON_LOG_FORMAT %{CLIENT:client_ip} %{NOTSPACE:ident} %{NOTSPACE:auth} \[%{HTTPDATE:ts:ts-httpd}\] "(?:%{WORD:verb:tag} %{NOTSPACE:request}(?: HTTP/%{NUMBER:http_version:float})?|%{DATA})" %{NUMBER:resp_code:tag} (?:%{NUMBER:resp_bytes:int}|-)
+
+# Combined log format is the same as the common log format but with the addition
+# of two quoted strings at the end for "referrer" and "agent"
+#   See Examples at http://httpd.apache.org/docs/current/mod/mod_log_config.html
+COMBINED_LOG_FORMAT %{COMMON_LOG_FORMAT} "%{DATA:referrer}" "%{DATA:agent}"
+
+# HTTPD log formats
+HTTPD20_ERRORLOG \[%{HTTPDERROR_DATE:timestamp}\] \[%{LOGLEVEL:loglevel:tag}\] (?:\[client %{IPORHOST:clientip}\] ){0,1}%{GREEDYDATA:errormsg}
+HTTPD24_ERRORLOG \[%{HTTPDERROR_DATE:timestamp}\] \[%{WORD:module}:%{LOGLEVEL:loglevel:tag}\] \[pid %{POSINT:pid:int}:tid %{NUMBER:tid:int}\]( \(%{POSINT:proxy_errorcode:int}\)%{DATA:proxy_errormessage}:)?( \[client %{IPORHOST:client}:%{POSINT:clientport}\])? %{DATA:errorcode}: %{GREEDYDATA:message}
+HTTPD_ERRORLOG %{HTTPD20_ERRORLOG}|%{HTTPD24_ERRORLOG}
+
+# DATA spanning multiple lines
+MULTILINEDATA (.|\n)*
+`
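As a rough illustration of the `DURATION` pattern and the `duration` modifier defined above, the sketch below uses a simplified stand-in for `%{NUMBER}[nuµm]?s` (the real `NUMBER` pattern is broader) and converts the match to integer nanoseconds with `time.ParseDuration`. The name `durationNanos` is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

// durationRe is a simplified approximation of DURATION: a plain decimal
// number, an optional n/u/µ/m unit prefix, and a trailing "s".
var durationRe = regexp.MustCompile(`^\d+(?:\.\d+)?[nuµm]?s$`)

// durationNanos is a hypothetical helper that checks the text against the
// simplified pattern and returns the duration in nanoseconds.
func durationNanos(s string) (int64, error) {
	if !durationRe.MatchString(s) {
		return 0, fmt.Errorf("%q does not look like a DURATION", s)
	}
	d, err := time.ParseDuration(s)
	if err != nil {
		return 0, err
	}
	return int64(d), nil
}

func main() {
	for _, s := range []string{"5.432µs", "5.324ms", "10s"} {
		ns, err := durationNanos(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s = %d ns\n", s, ns)
	}
}
```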
--- a/plugins/parsers/grok/parser.go
+++ b/plugins/parsers/grok/parser.go
@ -0,0 +1,603 @@
+package grok
+
+import (
+	"bufio"
+	"bytes"
+	"errors"
+	"fmt"
+	"os"
+	"regexp"
+	"strconv"
+	"strings"
+	"time"
+
+	"github.com/vjeantet/grok"
+
+	"github.com/influxdata/telegraf"
+	"github.com/influxdata/telegraf/internal"
+	"github.com/influxdata/telegraf/metric"
+	"github.com/influxdata/telegraf/plugins/parsers"
+)
+
+var timeLayouts = map[string]string{
+	"ts-ansic":       "Mon Jan _2 15:04:05 2006",
+	"ts-unix":        "Mon Jan _2 15:04:05 MST 2006",
+	"ts-ruby":        "Mon Jan 02 15:04:05 -0700 2006",
+	"ts-rfc822":      "02 Jan 06 15:04 MST",
+	"ts-rfc822z":     "02 Jan 06 15:04 -0700", // RFC822 with numeric zone
+	"ts-rfc850":      "Monday, 02-Jan-06 15:04:05 MST",
+	"ts-rfc1123":     "Mon, 02 Jan 2006 15:04:05 MST",
+	"ts-rfc1123z":    "Mon, 02 Jan 2006 15:04:05 -0700", // RFC1123 with numeric zone
+	"ts-rfc3339":     "2006-01-02T15:04:05Z07:00",
+	"ts-rfc3339nano": "2006-01-02T15:04:05.999999999Z07:00",
+	"ts-httpd":       "02/Jan/2006:15:04:05 -0700",
+	// These five are not exactly "layouts", but they are special cases that
+	// will get handled in the ParseLine function.
+	"ts-epoch":      "EPOCH",
+	"ts-epochnano":  "EPOCH_NANO",
+	"ts-epochmilli": "EPOCH_MILLI",
+	"ts-syslog":     "SYSLOG_TIMESTAMP",
+	"ts":            "GENERIC_TIMESTAMP", // try parsing all known timestamp layouts.
+}
+
+const (
+	Measurement      = "measurement"
+	Int              = "int"
+	Tag              = "tag"
+	Float            = "float"
+	String           = "string"
+	Duration         = "duration"
+	Drop             = "drop"
+	Epoch            = "EPOCH"
+	EpochMilli       = "EPOCH_MILLI"
+	EpochNano        = "EPOCH_NANO"
+	SyslogTimestamp  = "SYSLOG_TIMESTAMP"
+	GenericTimestamp = "GENERIC_TIMESTAMP"
+)
+
+var (
+	// matches named captures that contain a modifier.
+	//   ie,
+	//     %{NUMBER:bytes:int}
+	//     %{IPORHOST:clientip:tag}
+	//     %{HTTPDATE:ts1:ts-httpd}
+	//     %{HTTPDATE:ts2:ts-"02 Jan 06 15:04"}
+	modifierRe = regexp.MustCompile(`%{\w+:(\w+):(ts-".+"|t?s?-?\w+)}`)
+	// matches a plain pattern name. ie, %{NUMBER}
+	patternOnlyRe = regexp.MustCompile(`%{(\w+)}`)
+)
+
+// Parser is the primary struct that handles the grok patterns defined in the config toml
+type Parser struct {
+	Patterns []string `toml:"grok_patterns"`
+	// namedPatterns is a list of internally-assigned names to the patterns
+	// specified by the user in Patterns.
+	// They will look like:
+	//   GROK_INTERNAL_PATTERN_0, GROK_INTERNAL_PATTERN_1, etc.
+	NamedPatterns      []string          `toml:"grok_named_patterns"`
+	CustomPatterns     string            `toml:"grok_custom_patterns"`
+	CustomPatternFiles []string          `toml:"grok_custom_pattern_files"`
+	Multiline          bool              `toml:"grok_multiline"`
+	Measurement        string            `toml:"-"`
+	DefaultTags        map[string]string `toml:"-"`
+	Log                telegraf.Logger   `toml:"-"`
+
+	// Timezone is an optional component to help render log dates to
+	// your chosen zone.
+	// Default: "" which renders UTC
+	// Options are as follows:
+	// 1. Local             -- interpret based on machine localtime
+	// 2. "America/Chicago" -- Unix TZ values like those found in https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
+	// 3. UTC               -- or blank/unspecified, will return timestamp in UTC
+	Timezone string `toml:"grok_timezone"`
+	loc      *time.Location
+
+	// UniqueTimestamp, when set to "disable", means the timestamp will not be incremented if there is a duplicate.
+	UniqueTimestamp string `toml:"grok_unique_timestamp"`
+
+	// typeMap is a map of patterns -> capture name -> modifier,
+	//   ie, {
+	//          "%{TESTLOG}":
+	//             {
+	//                "bytes": "int",
+	//                "clientip": "tag"
+	//             }
+	//       }
+	typeMap map[string]map[string]string
+	// tsMap is a map of patterns -> capture name -> timestamp layout.
+	//   ie, {
+	//          "%{TESTLOG}":
+	//             {
+	//                "httptime": "02/Jan/2006:15:04:05 -0700"
+	//             }
+	//       }
+	tsMap map[string]map[string]string
+	// patternsMap is a map of all of the parsed patterns from CustomPatterns
+	// and CustomPatternFiles.
+	//   ie, {
+	//          "DURATION":      "%{NUMBER}[nuµm]?s"
+	//          "RESPONSE_CODE": "%{NUMBER:rc:tag}"
+	//       }
+	patternsMap map[string]string
+	// foundTSLayouts is a slice of timestamp patterns that have been found
+	// in the log lines. This slice gets updated if the user uses the generic
+	// 'ts' modifier for timestamps. This slice is checked first for matches,
+	// so that previously-matched layouts get priority over all other timestamp
+	// layouts.
+	foundTSLayouts []string
+
+	timeFunc func() time.Time
+	g        *grok.Grok
+	tsModder *tsModder
+}
+
+// Compile is a bound method to Parser which will process the options for our parser
+func (p *Parser) Compile() error {
+	p.typeMap = make(map[string]map[string]string)
+	p.tsMap = make(map[string]map[string]string)
+	p.patternsMap = make(map[string]string)
+	p.tsModder = &tsModder{}
+	var err error
+	p.g, err = grok.NewWithConfig(&grok.Config{NamedCapturesOnly: true})
+	if err != nil {
+		return err
+	}
+
+	if p.UniqueTimestamp == "" {
+		p.UniqueTimestamp = "auto"
+	}
+
+	// Give Patterns fake names so that they can be treated as named
+	// "custom patterns"
+	p.NamedPatterns = make([]string, 0, len(p.Patterns))
+	for i, pattern := range p.Patterns {
+		pattern = strings.TrimSpace(pattern)
+		if pattern == "" {
+			continue
+		}
+		name := fmt.Sprintf("GROK_INTERNAL_PATTERN_%d", i)
+		p.CustomPatterns += "\n" + name + " " + pattern + "\n"
+		p.NamedPatterns = append(p.NamedPatterns, "%{"+name+"}")
+	}
+
+	if len(p.NamedPatterns) == 0 {
+		return errors.New("pattern required")
+	}
+
+	// Combine user-supplied CustomPatterns with DefaultPatterns and parse
+	// them together as the same type of pattern.
+	p.CustomPatterns = DefaultPatterns + p.CustomPatterns
+	if len(p.CustomPatterns) != 0 {
+		scanner := bufio.NewScanner(strings.NewReader(p.CustomPatterns))
+		p.addCustomPatterns(scanner)
+	}
+
+	// Parse any custom pattern files supplied.
+	for _, filename := range p.CustomPatternFiles {
+		file, fileErr := os.Open(filename)
+		if fileErr != nil {
+			return fileErr
+		}
+
+		scanner := bufio.NewScanner(bufio.NewReader(file))
+		p.addCustomPatterns(scanner)
+		// Close each file once it has been scanned to avoid leaking descriptors.
+		file.Close()
+	}
+
+	p.loc, err = time.LoadLocation(p.Timezone)
+	if err != nil {
+		p.Log.Warnf("Improper timezone supplied (%s), setting loc to UTC", p.Timezone)
+		p.loc = time.UTC
+	}
+
+	if p.timeFunc == nil {
+		p.timeFunc = time.Now
+	}
+
+	return p.compileCustomPatterns()
+}
+
+// ParseLine is the primary function to process individual lines, returning the metrics
+func (p *Parser) ParseLine(line string) (telegraf.Metric, error) {
+	var err error
+	// values are the parsed fields from the log line
+	var values map[string]string
+	// the matching pattern string
+	var patternName string
+	for _, pattern := range p.NamedPatterns {
+		if values, err = p.g.Parse(pattern, line); err != nil {
+			return nil, err
+		}
+		if len(values) != 0 {
+			patternName = pattern
+			break
+		}
+	}
+
+	if len(values) == 0 {
+		p.Log.Debugf("Grok no match found for or no data extracted from: %q", line)
+		return nil, nil
+	}
+
+	fields := make(map[string]interface{})
+	tags := make(map[string]string)
+
+	// add default tags
+	for k, v := range p.DefaultTags {
+		tags[k] = v
+	}
+
+	timestamp := time.Now()
+	for k, v := range values {
+		if k == "" || v == "" {
+			continue
+		}
+		// t is the modifier of the field
+		var t string
+		// check if pattern has some modifiers
+		if types, ok := p.typeMap[patternName]; ok {
+			t = types[k]
+		}
+		// if we didn't find a modifier, check if we have a timestamp layout
+		if t == "" {
+			if ts, ok := p.tsMap[patternName]; ok {
+				// check if the modifier is a timestamp layout
+				if layout, ok := ts[k]; ok {
+					t = layout
+				}
+			}
+		}
+		// if we didn't find a type OR timestamp modifier, assume string
+		if t == "" {
+			t = String
+		}
+
+		switch t {
+		case Measurement:
+			p.Measurement = v
+		case Int:
+			iv, err := strconv.ParseInt(v, 0, 64)
+			if err != nil {
+				p.Log.Errorf("Error parsing %s to int: %s", v, err)
+			} else {
+				fields[k] = iv
+			}
+		case Float:
+			fv, err := strconv.ParseFloat(v, 64)
+			if err != nil {
+				p.Log.Errorf("Error parsing %s to float: %s", v, err)
+			} else {
+				fields[k] = fv
+			}
+		case Duration:
+			d, err := time.ParseDuration(v)
+			if err != nil {
+				p.Log.Errorf("Error parsing %s to duration: %s", v, err)
+			} else {
+				fields[k] = int64(d)
+			}
+		case Tag:
+			tags[k] = v
+		case String:
+			fields[k] = v
+		case Epoch:
+			parts := strings.SplitN(v, ".", 2)
+			if len(parts) == 0 {
+				p.Log.Errorf("Error parsing %s to timestamp: %s", v, err)
+				break
+			}
+
+			sec, err := strconv.ParseInt(parts[0], 10, 64)
+			if err != nil {
+				p.Log.Errorf("Error parsing %s to timestamp: %s", v, err)
+				break
+			}
+			ts := time.Unix(sec, 0)
+
+			if len(parts) == 2 {
+				padded := fmt.Sprintf("%-9s", parts[1])
+				nsString := strings.ReplaceAll(padded[:9], " ", "0")
+				nanosec, err := strconv.ParseInt(nsString, 10, 64)
+				if err != nil {
+					p.Log.Errorf("Error parsing %s to timestamp: %s", v, err)
+					break
+				}
+				ts = ts.Add(time.Duration(nanosec) * time.Nanosecond)
+			}
+			timestamp = ts
+		case EpochMilli:
+			ms, err := strconv.ParseInt(v, 10, 64)
+			if err != nil {
+				p.Log.Errorf("Error parsing %s to int: %s", v, err)
+			} else {
+				timestamp = time.Unix(0, ms*int64(time.Millisecond))
+			}
+		case EpochNano:
+			iv, err := strconv.ParseInt(v, 10, 64)
+			if err != nil {
+				p.Log.Errorf("Error parsing %s to int: %s", v, err)
+			} else {
+				timestamp = time.Unix(0, iv)
+			}
+		case SyslogTimestamp:
+			ts, err := internal.ParseTimestamp(time.Stamp, v, p.loc)
+			if err == nil {
+				if ts.Year() == 0 {
+					ts = ts.AddDate(timestamp.Year(), 0, 0)
+				}
+				timestamp = ts
+			} else {
+				p.Log.Errorf("Error parsing %s to time layout [%s]: %s", v, t, err)
+			}
+		case GenericTimestamp:
+			var foundTS bool
+			// first try timestamp layouts that we've already found
+			for _, layout := range p.foundTSLayouts {
+				ts, err := internal.ParseTimestamp(layout, v, p.loc)
+				if err == nil {
+					timestamp = ts
+					foundTS = true
+					break
+				}
+			}
+			// if we haven't found a timestamp layout yet, try all timestamp
+			// layouts.
+			if !foundTS {
+				for _, layout := range timeLayouts {
+					ts, err := internal.ParseTimestamp(layout, v, p.loc)
+					if err == nil {
+						timestamp = ts
+						foundTS = true
+						p.foundTSLayouts = append(p.foundTSLayouts, layout)
+						break
+					}
+				}
+			}
+			// if we still haven't found a timestamp layout, log it and we will
+			// just use time.Now()
+			if !foundTS {
+				p.Log.Errorf("Error parsing timestamp [%s], could not find any "+
+					"suitable time layouts.", v)
+			}
+		case Drop:
+		// goodbye!
+		default:
+			v = strings.ReplaceAll(v, ",", ".")
+			ts, err := internal.ParseTimestamp(t, v, p.loc)
+			if err == nil {
+				if ts.Year() == 0 {
+					ts = ts.AddDate(timestamp.Year(), 0, 0)
+				}
+				timestamp = ts
+			} else {
+				p.Log.Errorf("Error parsing %s to time layout [%s]: %s", v, t, err)
+			}
+		}
+	}
+
+	if p.UniqueTimestamp != "auto" {
+		return metric.New(p.Measurement, tags, fields, timestamp), nil
+	}
+
+	return metric.New(p.Measurement, tags, fields, p.tsModder.tsMod(timestamp)), nil
+}
+
+func (p *Parser) Parse(buf []byte) ([]telegraf.Metric, error) {
+	metrics := make([]telegraf.Metric, 0)
+
+	if p.Multiline {
+		m, err := p.ParseLine(string(buf))
+		if err != nil {
+			return nil, err
+		}
+		if m != nil {
+			metrics = append(metrics, m)
+		}
+		return metrics, nil
+	}
+
+	scanner := bufio.NewScanner(bytes.NewReader(buf))
+	for scanner.Scan() {
+		line := scanner.Text()
+		m, err := p.ParseLine(line)
+		if err != nil {
+			return nil, err
+		}
+
+		if m == nil {
+			continue
+		}
+		metrics = append(metrics, m)
+	}
+
+	return metrics, nil
+}
+
+func (p *Parser) SetDefaultTags(tags map[string]string) {
+	p.DefaultTags = tags
+}
+
+func (p *Parser) addCustomPatterns(scanner *bufio.Scanner) {
+	for scanner.Scan() {
+		line := strings.TrimSpace(scanner.Text())
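+		if len(line) > 0 && line[0] != '#' {
+			names := strings.SplitN(line, " ", 2)
+			// Skip malformed lines that have a name but no pattern body,
+			// which would otherwise cause an index-out-of-range panic.
+			if len(names) < 2 {
+				continue
+			}
+			p.patternsMap[names[0]] = names[1]
+		}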
+	}
+}
+
+func (p *Parser) compileCustomPatterns() error {
+	var err error
+	// check if the pattern contains a subpattern that is already defined
+	// replace it with the subpattern for modifier inheritance.
+	for i := 0; i < 2; i++ {
+		for name, pattern := range p.patternsMap {
+			subNames := patternOnlyRe.FindAllStringSubmatch(pattern, -1)
+			for _, subName := range subNames {
+				if subPattern, ok := p.patternsMap[subName[1]]; ok {
+					pattern = strings.Replace(pattern, subName[0], subPattern, 1)
+				}
+			}
+			p.patternsMap[name] = pattern
+		}
+	}
+
+	// check if pattern contains modifiers. Parse them out if it does.
+	for name, pattern := range p.patternsMap {
+		if modifierRe.MatchString(pattern) {
+			// this pattern has modifiers, so parse out the modifiers
+			pattern, err = p.parseTypedCaptures(name, pattern)
+			if err != nil {
+				return err
+			}
+			p.patternsMap[name] = pattern
+		}
+	}
+
+	return p.g.AddPatternsFromMap(p.patternsMap)
+}
+
+// parseTypedCaptures parses the capture modifiers, and then deletes the
+// modifier from the line so that it is a valid "grok" pattern again.
+//
+//	ie,
+//	  %{NUMBER:bytes:int}      => %{NUMBER:bytes}      (stores %{NUMBER}->bytes->int)
+//	  %{IPORHOST:clientip:tag} => %{IPORHOST:clientip} (stores %{IPORHOST}->clientip->tag)
+func (p *Parser) parseTypedCaptures(name, pattern string) (string, error) {
+	matches := modifierRe.FindAllStringSubmatch(pattern, -1)
+
+	// grab the name of the capture pattern
+	patternName := "%{" + name + "}"
+	// create type map for this pattern
+	p.typeMap[patternName] = make(map[string]string)
+	p.tsMap[patternName] = make(map[string]string)
+
+	// boolean to verify that each pattern only has a single ts- data type.
+	hasTimestamp := false
+	for _, match := range matches {
+		// regex capture 1 is the name of the capture
+		// regex capture 2 is the modifier of the capture
+		if strings.HasPrefix(match[2], "ts") {
+			if hasTimestamp {
+				return pattern, fmt.Errorf("logparser pattern compile error: "+
+					"Each pattern is allowed only one named "+
+					"timestamp data type. pattern: %s", pattern)
+			}
+			if layout, ok := timeLayouts[match[2]]; ok {
+				// built-in time format
+				p.tsMap[patternName][match[1]] = layout
+			} else {
+				// custom time format
+				p.tsMap[patternName][match[1]] = strings.TrimSuffix(strings.TrimPrefix(match[2], `ts-"`), `"`)
+			}
+			hasTimestamp = true
+		} else {
+			p.typeMap[patternName][match[1]] = match[2]
+		}
+
+		// the modifier is not a valid part of a "grok" pattern, so remove it
+		// from the pattern.
+		pattern = strings.Replace(pattern, ":"+match[2]+"}", "}", 1)
+	}
+
+	return pattern, nil
+}
+
+// tsModder is a struct for incrementing identical timestamps of log lines
+// so that we don't push identical metrics that will get overwritten.
+type tsModder struct {
+	dupe     time.Time
+	last     time.Time
+	incr     time.Duration
+	incrn    time.Duration
+	rollover time.Duration
+}
+
+// tsMod increments the given timestamp one unit more from the previous
+// duplicate timestamp.
+// the increment unit is determined as the next smallest time unit below the
+// most significant time unit of ts.
+//
+//	ie, if the input is at ms precision, it will increment it 1µs.
+func (t *tsModder) tsMod(ts time.Time) time.Time {
+	if ts.IsZero() {
+		return ts
+	}
+	defer func() { t.last = ts }()
+	// don't mod the time if we don't need to
+	if t.last.IsZero() || ts.IsZero() {
+		t.incrn = 0
+		t.rollover = 0
+		return ts
+	}
+	if !ts.Equal(t.last) && !ts.Equal(t.dupe) {
+		t.incr = 0
+		t.incrn = 0
+		t.rollover = 0
+		return ts
+	}
+	if ts.Equal(t.last) {
+		t.dupe = ts
+	}
+
+	if ts.Equal(t.dupe) && t.incr == time.Duration(0) {
+		tsNano := ts.UnixNano()
+
+		d := int64(10)
+		counter := 1
+		for {
+			a := tsNano % d
+			if a > 0 {
+				break
+			}
+			d = d * 10
+			counter++
+		}
+
+		switch {
+		case counter <= 6:
+			t.incr = time.Nanosecond
+		case counter <= 9:
+			t.incr = time.Microsecond
+		case counter > 9:
+			t.incr = time.Millisecond
+		}
+	}
+
+	t.incrn++
+	if t.incrn == 999 && t.incr > time.Nanosecond {
+		t.rollover = t.incr * t.incrn
+		t.incrn = 1
+		t.incr = t.incr / 1000
+		if t.incr < time.Nanosecond {
+			t.incr = time.Nanosecond
+		}
+	}
+	return ts.Add(t.incr*t.incrn + t.rollover)
+}
+
+func (p *Parser) Init() error {
+	if len(p.Patterns) == 0 {
+		p.Patterns = []string{"%{COMBINED_LOG_FORMAT}"}
+	}
+
+	if p.UniqueTimestamp == "" {
+		p.UniqueTimestamp = "auto"
+	}
+
+	if p.Timezone == "" {
+		p.Timezone = "UTC"
+	}
+
+	return p.Compile()
+}
+
+func init() {
+	parsers.Add("grok",
+		func(defaultMetricName string) telegraf.Parser {
+			return &Parser{
+				Measurement: defaultMetricName,
+			}
+		},
+	)
+}
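The modifier handling in `parseTypedCaptures` can be sketched in isolation. The example below reuses the `modifierRe` expression from the file above; everything else is a simplified assumption. In particular `stripModifiers` is a hypothetical name, and it records type and timestamp modifiers in a single map where the parser keeps separate `typeMap` and `tsMap` structures.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// modifierRe is copied from parser.go: capture 1 is the semantic name,
// capture 2 is the modifier.
var modifierRe = regexp.MustCompile(`%{\w+:(\w+):(ts-".+"|t?s?-?\w+)}`)

// stripModifiers is a hypothetical, simplified version of the idea behind
// parseTypedCaptures: remember each capture's modifier, then remove it so
// that the pattern is plain grok syntax again.
func stripModifiers(pattern string) (string, map[string]string) {
	modifiers := make(map[string]string)
	for _, m := range modifierRe.FindAllStringSubmatch(pattern, -1) {
		modifiers[m[1]] = m[2]
		pattern = strings.Replace(pattern, ":"+m[2]+"}", "}", 1)
	}
	return pattern, modifiers
}

func main() {
	stripped, mods := stripModifiers(`%{NUMBER:bytes:int} %{IPORHOST:clientip:tag}`)
	fmt.Println(stripped)
	fmt.Println(mods)
}
```

Keeping the modifiers outside the pattern is what lets the underlying grok library compile an unmodified expression while the parser applies type conversion afterwards.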
--- a/plugins/parsers/grok/parser_test.go
+++ b/plugins/parsers/grok/parser_test.go
--- a/plugins/parsers/grok/testdata/test-patterns
+++ b/plugins/parsers/grok/testdata/test-patterns
@ -0,0 +1,14 @@
+# Test A log line:
+#   [04/Jun/2016:12:41:45 +0100] 1.25 200 192.168.1.1 5.432µs 101
+DURATION %{NUMBER}[nuµm]?s
+RESPONSE_CODE %{NUMBER:response_code:tag}
+RESPONSE_TIME %{DURATION:response_time:duration}
+TEST_LOG_A \[%{HTTPDATE:timestamp:ts-httpd}\] %{NUMBER:myfloat:float} %{RESPONSE_CODE} %{IPORHOST:clientip} %{RESPONSE_TIME} %{NUMBER:myint:int}
+
+# Test B log line:
+#   [04/06/2016--12:41:45] 1.25 mystring dropme nomodifier
+TEST_TIMESTAMP %{MONTHDAY}/%{MONTHNUM}/%{YEAR}--%{TIME}
+TEST_LOG_B \[%{TEST_TIMESTAMP:timestamp:ts-"02/01/2006--15:04:05"}\] %{NUMBER:myfloat:float} %{WORD:mystring:string} %{WORD:dropme:drop} %{WORD:nomodifier}
+
+TEST_TIMESTAMP %{MONTHDAY}/%{MONTHNUM}/%{YEAR}--%{TIME}
+TEST_LOG_BAD \[%{TEST_TIMESTAMP:timestamp:ts-"02/01/2006--15:04:05"}\] %{NUMBER:myfloat:float} %{WORD:mystring:int} %{WORD:dropme:drop} %{WORD:nomodifier}
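The test patterns above nest plain references such as `%{RESPONSE_CODE}` inside a larger pattern. A minimal sketch of how such references might be expanded so that the inner modifiers are inherited, loosely following the first loop in `compileCustomPatterns`, is shown below. The function name `expand` and its `passes` parameter are assumptions for illustration; the definitions are taken from the test file.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// patternOnlyRe is copied from parser.go: it matches a plain %{NAME}
// reference with no semantic name or modifier.
var patternOnlyRe = regexp.MustCompile(`%{(\w+)}`)

// expand is a hypothetical helper that replaces plain references with
// their definitions for a fixed number of passes.
func expand(pattern string, defs map[string]string, passes int) string {
	for i := 0; i < passes; i++ {
		for _, sub := range patternOnlyRe.FindAllStringSubmatch(pattern, -1) {
			if def, ok := defs[sub[1]]; ok {
				pattern = strings.Replace(pattern, sub[0], def, 1)
			}
		}
	}
	return pattern
}

func main() {
	defs := map[string]string{
		"RESPONSE_CODE": `%{NUMBER:response_code:tag}`,
		"RESPONSE_TIME": `%{DURATION:response_time:duration}`,
	}
	fmt.Println(expand(`%{RESPONSE_CODE} %{RESPONSE_TIME}`, defs, 2))
}
```

References that already carry a semantic name, such as `%{DURATION:response_time:duration}`, are not matched by `patternOnlyRe` and are left for the grok library to resolve.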
--- a/plugins/parsers/grok/testdata/test_a.log
+++ b/plugins/parsers/grok/testdata/test_a.log
@ -0,0 +1 @@
+[04/Jun/2016:12:41:45 +0100] 1.25 200 192.168.1.1 5.432µs 101
--- a/plugins/parsers/grok/testdata/test_b.log
+++ b/plugins/parsers/grok/testdata/test_b.log
@ -0,0 +1 @@
+[04/06/2016--12:41:45] 1.25 mystring dropme nomodifier
--- a/plugins/parsers/grok/testdata/test_multiline.log
+++ b/plugins/parsers/grok/testdata/test_multiline.log
@ -0,0 +1,3 @@
+2022-12-01T12:41:45Z Error A long and
+    multiline
+    message