
Adding upstream version 1.34.4.

Signed-off-by: Daniel Baumann <daniel@debian.org>
Daniel Baumann 2025-05-24 07:26:29 +02:00
parent e393c3af3f
commit 4978089aab
Signed by: daniel
GPG key ID: FBB4F0E80A80222F
4963 changed files with 677545 additions and 0 deletions


@ -0,0 +1,141 @@
# Tail Input Plugin
The tail plugin "tails" a logfile and parses each log message.
By default, the tail plugin acts like the following unix tail command:
```shell
tail -F --lines=0 myfile.log
```
- `-F` means that it will follow the _name_ of the given file, so
that it will be compatible with log-rotated files, and that it will retry on
inaccessible files.
- `--lines=0` means that it will start at the end of the file (unless
the `initial_read_offset` option is set).
See <http://man7.org/linux/man-pages/man1/tail.1.html> for more details.
The plugin expects messages in one of the [Telegraf Input Data
Formats](../../../docs/DATA_FORMATS_INPUT.md).
## Service Input <!-- @/docs/includes/service_input.md -->
This plugin is a service input. Normal plugins gather metrics determined by the
interval setting. Service plugins start a service to listen and wait for
metrics or events to occur. Service plugins have two key differences from
normal plugins:
1. The global or plugin specific `interval` setting may not apply
2. The CLI options of `--test`, `--test-wait`, and `--once` may not produce
output for this plugin
## Global configuration options <!-- @/docs/includes/plugin_config.md -->
In addition to the plugin-specific configuration settings, plugins support
additional global and plugin configuration settings. These settings are used to
modify metrics, tags, and fields, or to create aliases, configure ordering, etc.
See the [CONFIGURATION.md][CONFIGURATION.md] for more details.
[CONFIGURATION.md]: ../../../docs/CONFIGURATION.md#plugins
## Configuration
```toml @sample.conf
# Parse the new lines appended to a file
[[inputs.tail]]
## File names or a pattern to tail.
## These accept standard unix glob matching rules, but with the addition of
## ** as a "super asterisk", i.e.:
## "/var/log/**.log" -> recursively find all .log files in /var/log
## "/var/log/*/*.log" -> find all .log files with a parent dir in /var/log
## "/var/log/apache.log" -> just tail the apache log file
## "/var/log/log[!1-2]*" -> tail files without 1-2
## "/var/log/log[^1-2]*" -> identical behavior as above
## See https://github.com/gobwas/glob for more examples
##
files = ["/var/mymetrics.out"]
## Offset to start reading at
## The following methods are available:
## beginning -- start reading from the beginning of the file ignoring any persisted offset
## end -- start reading from the end of the file ignoring any persisted offset
## saved-or-beginning -- use the persisted offset of the file or, if no offset persisted, start from the beginning of the file
## saved-or-end -- use the persisted offset of the file or, if no offset persisted, start from the end of the file
# initial_read_offset = "saved-or-end"
## Whether file is a named pipe
# pipe = false
## Method used to watch for file updates. Can be either "inotify" or "poll".
## inotify is supported on linux, *bsd, and macOS, while Windows requires
## using poll. Poll checks for changes every 250ms.
# watch_method = "inotify"
## Maximum lines of the file to process that have not yet been written by the
## output. For best throughput set based on the number of metrics on each
## line and the size of the output's metric_batch_size.
# max_undelivered_lines = 1000
## Character encoding to use when interpreting the file contents. Invalid
## characters are replaced using the unicode replacement character. When set
## to the empty string the data is not decoded to text.
## ex: character_encoding = "utf-8"
## character_encoding = "utf-16le"
## character_encoding = "utf-16be"
## character_encoding = ""
# character_encoding = ""
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "influx"
## Set the tag that will contain the path of the tailed file.
## If you don't want this tag, set it to an empty string.
# path_tag = "path"
## Filters to apply to files before generating metrics
## "ansi_color" removes ANSI colors
# filters = []
## multiline parser/codec
## https://www.elastic.co/guide/en/logstash/2.4/plugins-filters-multiline.html
#[inputs.tail.multiline]
## The pattern should be a regexp which matches what you believe to be an
## indicator that the field is part of an event consisting of multiple lines
## of log data.
#pattern = "^\s"
## The field's value must be previous or next and indicates the relation to the
## multi-line event.
#match_which_line = "previous"
## The invert_match can be true or false (defaults to false).
## If true, a message not matching the pattern constitutes a match of the
## multiline filter, and match_which_line is applied. (The inverse also holds.)
#invert_match = false
## The handling method for quoted text (defaults to 'ignore').
## The following methods are available:
## ignore -- do not consider quotation (default)
## single-quotes -- consider text quoted by single quotes (')
## double-quotes -- consider text quoted by double quotes (")
## backticks -- consider text quoted by backticks (`)
## When handling quotes, escaped quotes (e.g. \") are handled correctly.
#quotation = "ignore"
## The preserve_newline option can be true or false (defaults to false).
## If true, the newline character is preserved for multiline elements,
## this is useful to preserve message-structure e.g. for logging outputs.
#preserve_newline = false
## After the specified timeout, this plugin sends the multiline event even if
## no new pattern is found to start a new event. The default is 5s.
#timeout = "5s"
```
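To illustrate the multiline `previous` mode, here is a minimal standalone sketch (an illustration under assumed inputs, not the plugin's implementation): a line matching the pattern is appended to the buffered event, while a non-matching line flushes the buffer and starts a new event.

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
)

// joinPrevious sketches the multiline "previous" mode: a line matching
// pattern is appended to the previously buffered event; a non-matching
// line flushes the buffered event and starts a new one.
func joinPrevious(pattern string, lines []string) []string {
	re := regexp.MustCompile(pattern)
	var events []string
	var buf bytes.Buffer
	for _, line := range lines {
		if buf.Len() > 0 && re.MatchString(line) {
			// Continuation: append to the previous line.
			buf.WriteString(line)
			continue
		}
		if buf.Len() > 0 {
			events = append(events, buf.String())
			buf.Reset()
		}
		buf.WriteString(line)
	}
	if buf.Len() > 0 {
		events = append(events, buf.String())
	}
	return events
}

func main() {
	lines := []string{"error A", "=> detail 1", "=> detail 2", "info B"}
	for _, e := range joinPrevious("^=>", lines) {
		fmt.Println(e)
	}
}
```

With the pattern `^=>`, the two continuation lines are folded into the first event, and `info B` becomes an event of its own.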
## Metrics
Metrics are produced according to the `data_format` option. Additionally, a
tag labeled `path` is added to each metric, containing the path of the tailed
file.
## Example Output
There is no predefined metric format; the output depends on the configured
`data_format` and the contents of the tailed files.
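As a sketch (assuming the sample configuration above with `data_format = "influx"` and the default `path_tag`), appending one line of influx line protocol to the tailed file might yield:

```text
# line appended to /var/mymetrics.out:
cpu,host=localhost usage_idle=98.2 1660819827410000000

# resulting metric, with the path tag added by the plugin:
cpu,host=localhost,path=/var/mymetrics.out usage_idle=98.2 1660819827410000000
```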


@ -0,0 +1,194 @@
package tail
import (
"bytes"
"errors"
"regexp"
"strings"
"time"
"github.com/influxdata/telegraf/config"
)
const (
// previous => Append current line to previous line
previous multilineMatchWhichLine = iota
// next => next line will be appended to current line
next
)
// Indicates relation to the multiline event: previous or next
type multilineMatchWhichLine int
type multiline struct {
config *multilineConfig
enabled bool
patternRegexp *regexp.Regexp
quote byte
inQuote bool
}
type multilineConfig struct {
Pattern string `toml:"pattern"`
MatchWhichLine multilineMatchWhichLine `toml:"match_which_line"`
InvertMatch bool `toml:"invert_match"`
PreserveNewline bool `toml:"preserve_newline"`
Quotation string `toml:"quotation"`
Timeout *config.Duration `toml:"timeout"`
}
func (m *multiline) isEnabled() bool {
return m.enabled
}
func (m *multiline) processLine(text string, buffer *bytes.Buffer) string {
if m.matchQuotation(text) || m.matchString(text) {
// Restore the newline removed by tail's scanner
if buffer.Len() > 0 && m.config.PreserveNewline {
buffer.WriteString("\n")
}
buffer.WriteString(text)
return ""
}
if m.config.MatchWhichLine == previous {
previousText := buffer.String()
buffer.Reset()
buffer.WriteString(text)
text = previousText
} else {
// next
if buffer.Len() > 0 {
if m.config.PreserveNewline {
buffer.WriteString("\n")
}
buffer.WriteString(text)
text = buffer.String()
buffer.Reset()
}
}
return text
}
func (m *multiline) matchQuotation(text string) bool {
if m.config.Quotation == "ignore" {
return false
}
escaped := 0
count := 0
for i := 0; i < len(text); i++ {
if text[i] == '\\' {
escaped++
continue
}
// If we do encounter a backslash-quote combination, we interpret this
// as an escaped quote and should not count the quote. However,
// backslash-backslash combinations (or any even number of backslashes)
// are interpreted as a literal backslash not escaping the quote.
if text[i] == m.quote && escaped%2 == 0 {
count++
}
// If we encounter any non-quote, non-backslash character we can
// safely reset the escape state.
escaped = 0
}
even := count%2 == 0
m.inQuote = (m.inQuote && even) || (!m.inQuote && !even)
return m.inQuote
}
func (m *multiline) matchString(text string) bool {
if m.patternRegexp != nil {
return m.patternRegexp.MatchString(text) != m.config.InvertMatch
}
return false
}
func (m *multilineConfig) newMultiline() (*multiline, error) {
var r *regexp.Regexp
if m.Pattern != "" {
var err error
if r, err = regexp.Compile(m.Pattern); err != nil {
return nil, err
}
}
var quote byte
switch m.Quotation {
case "", "ignore":
m.Quotation = "ignore"
case "single-quotes":
quote = '\''
case "double-quotes":
quote = '"'
case "backticks":
quote = '`'
default:
return nil, errors.New("invalid 'quotation' setting")
}
enabled := m.Pattern != "" || quote != 0
if m.Timeout == nil || time.Duration(*m.Timeout).Nanoseconds() == int64(0) {
d := config.Duration(5 * time.Second)
m.Timeout = &d
}
return &multiline{
config: m,
enabled: enabled,
patternRegexp: r,
quote: quote,
}, nil
}
func flush(buffer *bytes.Buffer) string {
if buffer.Len() == 0 {
return ""
}
text := buffer.String()
buffer.Reset()
return text
}
func (w multilineMatchWhichLine) String() string {
switch w {
case previous:
return "previous"
case next:
return "next"
}
return ""
}
// UnmarshalTOML implements ability to unmarshal multilineMatchWhichLine from TOML files.
func (w *multilineMatchWhichLine) UnmarshalTOML(data []byte) (err error) {
return w.UnmarshalText(data)
}
// UnmarshalText implements encoding.TextUnmarshaler
func (w *multilineMatchWhichLine) UnmarshalText(data []byte) (err error) {
s := string(data)
switch strings.ToUpper(s) {
case `PREVIOUS`, `"PREVIOUS"`, `'PREVIOUS'`:
*w = previous
return nil
case `NEXT`, `"NEXT"`, `'NEXT'`:
*w = next
return nil
}
*w = -1
return errors.New("unknown multiline MatchWhichLine")
}
// MarshalText implements encoding.TextMarshaler
func (w multilineMatchWhichLine) MarshalText() ([]byte, error) {
s := w.String()
if s != "" {
return []byte(s), nil
}
return nil, errors.New("unknown multiline MatchWhichLine")
}
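The parity-based quote tracking in `matchQuotation` can be illustrated with a self-contained sketch (a simplified reimplementation for illustration, not the plugin's API): a line toggles the in-quote state when it contains an odd number of unescaped quote characters.

```go
package main

import "fmt"

// countUnescaped counts quote characters that are not preceded by an odd
// number of backslashes, mirroring the escape handling in matchQuotation.
func countUnescaped(text string, quote byte) int {
	escaped := 0
	count := 0
	for i := 0; i < len(text); i++ {
		if text[i] == '\\' {
			escaped++
			continue
		}
		if text[i] == quote && escaped%2 == 0 {
			count++
		}
		// Any non-backslash character resets the escape state.
		escaped = 0
	}
	return count
}

func main() {
	inQuote := false
	lines := []string{`a,"start of quoted`, `still quoted \" here`, `end" done`}
	for _, line := range lines {
		odd := countUnescaped(line, '"')%2 == 1
		inQuote = inQuote != odd // toggle on odd parity, as in matchQuotation
		fmt.Printf("%q -> in quote: %v\n", line, inQuote)
	}
}
```

The middle line contains only an escaped quote, so the state stays "inside the quote" until the closing quote on the last line.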


@ -0,0 +1,475 @@
package tail
import (
"bufio"
"bytes"
"fmt"
"os"
"path/filepath"
"testing"
"time"
"github.com/stretchr/testify/require"
"github.com/influxdata/telegraf/config"
)
func TestMultilineConfigOK(t *testing.T) {
c := &multilineConfig{
Pattern: ".*",
MatchWhichLine: previous,
}
_, err := c.newMultiline()
require.NoError(t, err, "Configuration was OK.")
}
func TestMultilineConfigError(t *testing.T) {
c := &multilineConfig{
Pattern: "\xA0",
MatchWhichLine: previous,
}
_, err := c.newMultiline()
require.Error(t, err, "The pattern was invalid")
}
func TestMultilineConfigTimeoutSpecified(t *testing.T) {
duration := config.Duration(10 * time.Second)
c := &multilineConfig{
Pattern: ".*",
MatchWhichLine: previous,
Timeout: &duration,
}
m, err := c.newMultiline()
require.NoError(t, err, "Configuration was OK.")
require.Equal(t, duration, *m.config.Timeout)
}
func TestMultilineConfigDefaultTimeout(t *testing.T) {
duration := config.Duration(5 * time.Second)
c := &multilineConfig{
Pattern: ".*",
MatchWhichLine: previous,
}
m, err := c.newMultiline()
require.NoError(t, err, "Configuration was OK.")
require.Equal(t, duration, *m.config.Timeout)
}
func TestMultilineIsEnabled(t *testing.T) {
c := &multilineConfig{
Pattern: ".*",
MatchWhichLine: previous,
}
m, err := c.newMultiline()
require.NoError(t, err, "Configuration was OK.")
isEnabled := m.isEnabled()
require.True(t, isEnabled, "Should have been enabled")
}
func TestMultilineIsDisabled(t *testing.T) {
c := &multilineConfig{
MatchWhichLine: previous,
}
m, err := c.newMultiline()
require.NoError(t, err, "Configuration was OK.")
isEnabled := m.isEnabled()
require.False(t, isEnabled, "Should have been disabled")
}
func TestMultilineFlushEmpty(t *testing.T) {
var buffer bytes.Buffer
text := flush(&buffer)
require.Empty(t, text)
}
func TestMultilineFlush(t *testing.T) {
var buffer bytes.Buffer
buffer.WriteString("foo")
text := flush(&buffer)
require.Equal(t, "foo", text)
require.Zero(t, buffer.Len())
}
func TestMultiLineProcessLinePrevious(t *testing.T) {
c := &multilineConfig{
Pattern: "^=>",
MatchWhichLine: previous,
}
m, err := c.newMultiline()
require.NoError(t, err, "Configuration was OK.")
var buffer bytes.Buffer
text := m.processLine("1", &buffer)
require.Empty(t, text)
require.NotZero(t, buffer.Len())
text = m.processLine("=>2", &buffer)
require.Empty(t, text)
require.NotZero(t, buffer.Len())
text = m.processLine("=>3", &buffer)
require.Empty(t, text)
require.NotZero(t, buffer.Len())
text = m.processLine("4", &buffer)
require.Equal(t, "1=>2=>3", text)
require.NotZero(t, buffer.Len())
text = m.processLine("5", &buffer)
require.Equal(t, "4", text)
require.Equal(t, "5", buffer.String())
}
func TestMultiLineProcessLineNext(t *testing.T) {
c := &multilineConfig{
Pattern: "=>$",
MatchWhichLine: next,
}
m, err := c.newMultiline()
require.NoError(t, err, "Configuration was OK.")
var buffer bytes.Buffer
text := m.processLine("1=>", &buffer)
require.Empty(t, text)
require.NotZero(t, buffer.Len())
text = m.processLine("2=>", &buffer)
require.Empty(t, text)
require.NotZero(t, buffer.Len())
text = m.processLine("3=>", &buffer)
require.Empty(t, text)
require.NotZero(t, buffer.Len())
text = m.processLine("4", &buffer)
require.Equal(t, "1=>2=>3=>4", text)
require.Zero(t, buffer.Len())
text = m.processLine("5", &buffer)
require.Equal(t, "5", text)
require.Zero(t, buffer.Len())
}
func TestMultiLineMatchStringWithInvertMatchFalse(t *testing.T) {
c := &multilineConfig{
Pattern: "=>$",
MatchWhichLine: next,
InvertMatch: false,
}
m, err := c.newMultiline()
require.NoError(t, err, "Configuration was OK.")
matches1 := m.matchString("t=>")
matches2 := m.matchString("t")
require.True(t, matches1)
require.False(t, matches2)
}
func TestMultiLineMatchStringWithInvertTrue(t *testing.T) {
c := &multilineConfig{
Pattern: "=>$",
MatchWhichLine: next,
InvertMatch: true,
}
m, err := c.newMultiline()
require.NoError(t, err, "Configuration was OK.")
matches1 := m.matchString("t=>")
matches2 := m.matchString("t")
require.False(t, matches1)
require.True(t, matches2)
}
func TestMultilineWhat(t *testing.T) {
var w1 multilineMatchWhichLine
require.NoError(t, w1.UnmarshalTOML([]byte(`"previous"`)))
require.Equal(t, previous, w1)
var w2 multilineMatchWhichLine
require.NoError(t, w2.UnmarshalTOML([]byte(`previous`)))
require.Equal(t, previous, w2)
var w3 multilineMatchWhichLine
require.NoError(t, w3.UnmarshalTOML([]byte(`'previous'`)))
require.Equal(t, previous, w3)
var w4 multilineMatchWhichLine
require.NoError(t, w4.UnmarshalTOML([]byte(`"next"`)))
require.Equal(t, next, w4)
var w5 multilineMatchWhichLine
require.NoError(t, w5.UnmarshalTOML([]byte(`next`)))
require.Equal(t, next, w5)
var w6 multilineMatchWhichLine
require.NoError(t, w6.UnmarshalTOML([]byte(`'next'`)))
require.Equal(t, next, w6)
var w7 multilineMatchWhichLine
require.Error(t, w7.UnmarshalTOML([]byte(`nope`)))
require.Equal(t, multilineMatchWhichLine(-1), w7)
}
func TestMultilineQuoted(t *testing.T) {
tests := []struct {
name string
quotation string
quote string
filename string
}{
{
name: "single-quotes",
quotation: "single-quotes",
quote: `'`,
filename: "multiline_quoted_single.csv",
},
{
name: "double-quotes",
quotation: "double-quotes",
quote: `"`,
filename: "multiline_quoted_double.csv",
},
{
name: "backticks",
quotation: "backticks",
quote: "`",
filename: "multiline_quoted_backticks.csv",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
expected := []string{
`1660819827410,1,some text without quotes,A`,
fmt.Sprintf("1660819827411,1,%ssome text all quoted%s,A", tt.quote, tt.quote),
fmt.Sprintf("1660819827412,1,%ssome text all quoted\nbut wrapped%s,A", tt.quote, tt.quote),
fmt.Sprintf("1660819827420,2,some text with %squotes%s,B", tt.quote, tt.quote),
"1660819827430,3,some text with 'multiple \"quotes\" in `one` line',C",
fmt.Sprintf("1660819827440,4,some multiline text with %squotes\n", tt.quote) +
fmt.Sprintf("spanning \\%smultiple\\%s\n", tt.quote, tt.quote) +
fmt.Sprintf("lines%s but do not %send\ndirectly%s,D", tt.quote, tt.quote, tt.quote),
fmt.Sprintf("1660819827450,5,all of %sthis%s should %sbasically%s work...,E", tt.quote, tt.quote, tt.quote, tt.quote),
}
c := &multilineConfig{
MatchWhichLine: next,
Quotation: tt.quotation,
PreserveNewline: true,
}
m, err := c.newMultiline()
require.NoError(t, err)
f, err := os.Open(filepath.Join("testdata", tt.filename))
require.NoError(t, err)
scanner := bufio.NewScanner(f)
var buffer bytes.Buffer
var result []string
for scanner.Scan() {
line := scanner.Text()
text := m.processLine(line, &buffer)
if text == "" {
continue
}
result = append(result, text)
}
if text := flush(&buffer); text != "" {
result = append(result, text)
}
require.EqualValues(t, expected, result)
})
}
}
func TestMultilineQuotedError(t *testing.T) {
tests := []struct {
name string
filename string
quotation string
quote string
expected []string
}{
{
name: "messed up quoting",
filename: "multiline_quoted_messed_up.csv",
quotation: "single-quotes",
quote: `'`,
expected: []string{
"1660819827410,1,some text without quotes,A",
"1660819827411,1,'some text all quoted,A\n1660819827412,1,'some text all quoted",
"but wrapped,A"},
},
{
name: "missing closing quote",
filename: "multiline_quoted_missing_close.csv",
quotation: "single-quotes",
quote: `'`,
expected: []string{"1660819827411,2,'some text all quoted,B\n1660819827410,1,some text without quotes,A"},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
c := &multilineConfig{
MatchWhichLine: next,
Quotation: tt.quotation,
PreserveNewline: true,
}
m, err := c.newMultiline()
require.NoError(t, err)
f, err := os.Open(filepath.Join("testdata", tt.filename))
require.NoError(t, err)
scanner := bufio.NewScanner(f)
var buffer bytes.Buffer
var result []string
for scanner.Scan() {
line := scanner.Text()
text := m.processLine(line, &buffer)
if text == "" {
continue
}
result = append(result, text)
}
if text := flush(&buffer); text != "" {
result = append(result, text)
}
require.EqualValues(t, tt.expected, result)
})
}
}
func TestMultilineNewline(t *testing.T) {
tests := []struct {
name string
filename string
cfg *multilineConfig
expected []string
}{
{
name: "do not preserve newline",
cfg: &multilineConfig{
Pattern: `\[[0-9]{2}/[A-Za-z]{3}/[0-9]{4}:[0-9]{2}:[0-9]{2}:[0-9]{2} \+[0-9]{4}\]`,
InvertMatch: true,
},
filename: "test_multiline.log",
expected: []string{
`[04/Jun/2016:12:41:45 +0100] DEBUG HelloExample: This is debug`,
`[04/Jun/2016:12:41:48 +0100] INFO HelloExample: This is info`,
"[04/Jun/2016:12:41:46 +0100] ERROR HelloExample: Sorry, something wrong! " +
"java.lang.ArithmeticException: / by zero" +
"\tat com.foo.HelloExample2.divide(HelloExample2.java:24)" +
"\tat com.foo.HelloExample2.main(HelloExample2.java:14)",
`[04/Jun/2016:12:41:48 +0100] WARN HelloExample: This is warn`,
},
},
{
name: "preserve newline",
cfg: &multilineConfig{
Pattern: `\[[0-9]{2}/[A-Za-z]{3}/[0-9]{4}:[0-9]{2}:[0-9]{2}:[0-9]{2} \+[0-9]{4}\]`,
InvertMatch: true,
PreserveNewline: true,
},
filename: "test_multiline.log",
expected: []string{
`[04/Jun/2016:12:41:45 +0100] DEBUG HelloExample: This is debug`,
`[04/Jun/2016:12:41:48 +0100] INFO HelloExample: This is info`,
`[04/Jun/2016:12:41:46 +0100] ERROR HelloExample: Sorry, something wrong!` + ` ` + `
java.lang.ArithmeticException: / by zero
at com.foo.HelloExample2.divide(HelloExample2.java:24)
at com.foo.HelloExample2.main(HelloExample2.java:14)`,
`[04/Jun/2016:12:41:48 +0100] WARN HelloExample: This is warn`,
},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
m, err := tt.cfg.newMultiline()
require.NoError(t, err)
f, err := os.Open(filepath.Join("testdata", tt.filename))
require.NoError(t, err)
scanner := bufio.NewScanner(f)
var buffer bytes.Buffer
var result []string
for scanner.Scan() {
line := scanner.Text()
text := m.processLine(line, &buffer)
if text == "" {
continue
}
result = append(result, text)
}
if text := flush(&buffer); text != "" {
result = append(result, text)
}
require.EqualValues(t, tt.expected, result)
})
}
}
func TestMultiLineQuotedAndPattern(t *testing.T) {
c := &multilineConfig{
Pattern: "=>$",
MatchWhichLine: next,
Quotation: "double-quotes",
PreserveNewline: true,
}
m, err := c.newMultiline()
require.NoError(t, err, "Configuration was OK.")
var buffer bytes.Buffer
text := m.processLine("1=>", &buffer)
require.Empty(t, text)
require.NotZero(t, buffer.Len())
text = m.processLine("2=>", &buffer)
require.Empty(t, text)
require.NotZero(t, buffer.Len())
text = m.processLine(`"a quoted`, &buffer)
require.Empty(t, text)
require.NotZero(t, buffer.Len())
text = m.processLine(`multiline string"=>`, &buffer)
require.Empty(t, text)
require.NotZero(t, buffer.Len())
text = m.processLine("3=>", &buffer)
require.Empty(t, text)
require.NotZero(t, buffer.Len())
text = m.processLine("4", &buffer)
require.Equal(t, "1=>\n2=>\n\"a quoted\nmultiline string\"=>\n3=>\n4", text)
require.Zero(t, buffer.Len())
text = m.processLine("5", &buffer)
require.Equal(t, "5", text)
require.Zero(t, buffer.Len())
}


@ -0,0 +1,87 @@
# Parse the new lines appended to a file
[[inputs.tail]]
## File names or a pattern to tail.
## These accept standard unix glob matching rules, but with the addition of
## ** as a "super asterisk", i.e.:
## "/var/log/**.log" -> recursively find all .log files in /var/log
## "/var/log/*/*.log" -> find all .log files with a parent dir in /var/log
## "/var/log/apache.log" -> just tail the apache log file
## "/var/log/log[!1-2]*" -> tail files without 1-2
## "/var/log/log[^1-2]*" -> identical behavior as above
## See https://github.com/gobwas/glob for more examples
##
files = ["/var/mymetrics.out"]
## Offset to start reading at
## The following methods are available:
## beginning -- start reading from the beginning of the file ignoring any persisted offset
## end -- start reading from the end of the file ignoring any persisted offset
## saved-or-beginning -- use the persisted offset of the file or, if no offset persisted, start from the beginning of the file
## saved-or-end -- use the persisted offset of the file or, if no offset persisted, start from the end of the file
# initial_read_offset = "saved-or-end"
## Whether file is a named pipe
# pipe = false
## Method used to watch for file updates. Can be either "inotify" or "poll".
## inotify is supported on linux, *bsd, and macOS, while Windows requires
## using poll. Poll checks for changes every 250ms.
# watch_method = "inotify"
## Maximum lines of the file to process that have not yet been written by the
## output. For best throughput set based on the number of metrics on each
## line and the size of the output's metric_batch_size.
# max_undelivered_lines = 1000
## Character encoding to use when interpreting the file contents. Invalid
## characters are replaced using the unicode replacement character. When set
## to the empty string the data is not decoded to text.
## ex: character_encoding = "utf-8"
## character_encoding = "utf-16le"
## character_encoding = "utf-16be"
## character_encoding = ""
# character_encoding = ""
## Data format to consume.
## Each data format has its own unique set of configuration options, read
## more about them here:
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
data_format = "influx"
## Set the tag that will contain the path of the tailed file.
## If you don't want this tag, set it to an empty string.
# path_tag = "path"
## Filters to apply to files before generating metrics
## "ansi_color" removes ANSI colors
# filters = []
## multiline parser/codec
## https://www.elastic.co/guide/en/logstash/2.4/plugins-filters-multiline.html
#[inputs.tail.multiline]
## The pattern should be a regexp which matches what you believe to be an
## indicator that the field is part of an event consisting of multiple lines
## of log data.
#pattern = "^\s"
## The field's value must be previous or next and indicates the relation to the
## multi-line event.
#match_which_line = "previous"
## The invert_match can be true or false (defaults to false).
## If true, a message not matching the pattern constitutes a match of the
## multiline filter, and match_which_line is applied. (The inverse also holds.)
#invert_match = false
## The handling method for quoted text (defaults to 'ignore').
## The following methods are available:
## ignore -- do not consider quotation (default)
## single-quotes -- consider text quoted by single quotes (')
## double-quotes -- consider text quoted by double quotes (")
## backticks -- consider text quoted by backticks (`)
## When handling quotes, escaped quotes (e.g. \") are handled correctly.
#quotation = "ignore"
## The preserve_newline option can be true or false (defaults to false).
## If true, the newline character is preserved for multiline elements,
## this is useful to preserve message-structure e.g. for logging outputs.
#preserve_newline = false
## After the specified timeout, this plugin sends the multiline event even if
## no new pattern is found to start a new event. The default is 5s.
#timeout = "5s"

plugins/inputs/tail/tail.go Normal file

@ -0,0 +1,525 @@
//go:generate ../../../tools/readme_config_includer/generator
//go:build !solaris
package tail
import (
"bytes"
"context"
_ "embed"
"errors"
"fmt"
"io"
"strings"
"sync"
"time"
"github.com/dimchansky/utfbom"
"github.com/influxdata/tail"
"github.com/pborman/ansi"
"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/internal"
"github.com/influxdata/telegraf/internal/globpath"
"github.com/influxdata/telegraf/plugins/common/encoding"
"github.com/influxdata/telegraf/plugins/inputs"
"github.com/influxdata/telegraf/plugins/parsers"
)
//go:embed sample.conf
var sampleConfig string
var (
once sync.Once
offsets = make(map[string]int64)
offsetsMutex = new(sync.Mutex)
)
type Tail struct {
Files []string `toml:"files"`
FromBeginning bool `toml:"from_beginning" deprecated:"1.34.0;1.40.0;use 'initial_read_offset' with value 'beginning' instead"`
InitialReadOffset string `toml:"initial_read_offset"`
Pipe bool `toml:"pipe"`
WatchMethod string `toml:"watch_method"`
MaxUndeliveredLines int `toml:"max_undelivered_lines"`
CharacterEncoding string `toml:"character_encoding"`
PathTag string `toml:"path_tag"`
Filters []string `toml:"filters"`
filterColors bool
Log telegraf.Logger `toml:"-"`
tailers map[string]*tail.Tail
tailersMutex sync.RWMutex
offsets map[string]int64
parserFunc telegraf.ParserFunc
wg sync.WaitGroup
acc telegraf.TrackingAccumulator
MultilineConfig multilineConfig `toml:"multiline"`
multiline *multiline
ctx context.Context
cancel context.CancelFunc
sem semaphore
decoder *encoding.Decoder
}
type empty struct{}
type semaphore chan empty
func (*Tail) SampleConfig() string {
return sampleConfig
}
func (t *Tail) SetParserFunc(fn telegraf.ParserFunc) {
t.parserFunc = fn
}
func (t *Tail) Init() error {
// Backward compatibility setting
if t.InitialReadOffset == "" {
if t.FromBeginning {
t.InitialReadOffset = "beginning"
} else {
t.InitialReadOffset = "saved-or-end"
}
}
// Check settings
switch t.InitialReadOffset {
case "":
t.InitialReadOffset = "saved-or-end"
case "beginning", "end", "saved-or-end", "saved-or-beginning":
default:
return fmt.Errorf("invalid 'initial_read_offset' setting %q", t.InitialReadOffset)
}
if t.MaxUndeliveredLines == 0 {
return errors.New("max_undelivered_lines must be positive")
}
t.sem = make(semaphore, t.MaxUndeliveredLines)
for _, filter := range t.Filters {
if filter == "ansi_color" {
t.filterColors = true
}
}
// init offsets
t.offsets = make(map[string]int64)
dec, err := encoding.NewDecoder(t.CharacterEncoding)
if err != nil {
return fmt.Errorf("creating decoder failed: %w", err)
}
t.decoder = dec
return nil
}
func (t *Tail) Start(acc telegraf.Accumulator) error {
t.acc = acc.WithTracking(t.MaxUndeliveredLines)
t.ctx, t.cancel = context.WithCancel(context.Background())
t.wg.Add(1)
go func() {
defer t.wg.Done()
for {
select {
case <-t.ctx.Done():
return
case <-t.acc.Delivered():
<-t.sem
}
}
}()
var err error
t.multiline, err = t.MultilineConfig.newMultiline()
if err != nil {
return err
}
t.tailers = make(map[string]*tail.Tail)
err = t.tailNewFiles()
if err != nil {
return err
}
// assumption that once Start is called, all parallel plugins have already been initialized
offsetsMutex.Lock()
offsets = make(map[string]int64)
offsetsMutex.Unlock()
return err
}
func (t *Tail) getSeekInfo(file string) (*tail.SeekInfo, error) {
// Pipes do not support seeking
if t.Pipe {
return nil, nil
}
// Determine the actual position for continuing
switch t.InitialReadOffset {
case "beginning":
return &tail.SeekInfo{Whence: 0, Offset: 0}, nil
case "end":
return &tail.SeekInfo{Whence: 2, Offset: 0}, nil
case "", "saved-or-end":
if offset, ok := t.offsets[file]; ok {
t.Log.Debugf("Using offset %d for %q", offset, file)
return &tail.SeekInfo{Whence: 0, Offset: offset}, nil
}
return &tail.SeekInfo{Whence: 2, Offset: 0}, nil
case "saved-or-beginning":
if offset, ok := t.offsets[file]; ok {
t.Log.Debugf("Using offset %d for %q", offset, file)
return &tail.SeekInfo{Whence: 0, Offset: offset}, nil
}
return &tail.SeekInfo{Whence: 0, Offset: 0}, nil
default:
return nil, errors.New("invalid 'initial_read_offset' setting")
}
}
func (t *Tail) GetState() interface{} {
return t.offsets
}
func (t *Tail) SetState(state interface{}) error {
offsetsState, ok := state.(map[string]int64)
if !ok {
return errors.New("state has to be of type 'map[string]int64'")
}
for k, v := range offsetsState {
t.offsets[k] = v
}
return nil
}
func (t *Tail) Gather(_ telegraf.Accumulator) error {
return t.tailNewFiles()
}
func (t *Tail) Stop() {
t.tailersMutex.Lock()
defer t.tailersMutex.Unlock()
for filename, tailer := range t.tailers {
if !t.Pipe {
// store offset for resume
offset, err := tailer.Tell()
if err == nil {
t.Log.Debugf("Recording offset %d for %q", offset, tailer.Filename)
t.offsets[tailer.Filename] = offset
} else {
t.Log.Errorf("Recording offset for %q: %s", tailer.Filename, err.Error())
}
}
err := tailer.Stop()
if err != nil {
t.Log.Errorf("Stopping tail on %q: %s", tailer.Filename, err.Error())
}
// Explicitly delete the tailer from the map to avoid memory leaks
delete(t.tailers, filename)
}
t.cancel()
t.wg.Wait()
// persist offsets
offsetsMutex.Lock()
for k, v := range t.offsets {
offsets[k] = v
}
offsetsMutex.Unlock()
}
func (t *Tail) tailNewFiles() error {
var poll bool
if t.WatchMethod == "poll" {
poll = true
}
// Track files that we're currently processing
currentFiles := make(map[string]bool)
// Create a "tailer" for each file
for _, filepath := range t.Files {
g, err := globpath.Compile(filepath)
if err != nil {
t.Log.Errorf("Glob %q failed to compile: %s", filepath, err.Error())
continue
}
for _, file := range g.Match() {
// Mark this file as currently being processed
currentFiles[file] = true
// Check if we're already tailing this file
t.tailersMutex.RLock()
_, alreadyTailing := t.tailers[file]
t.tailersMutex.RUnlock()
if alreadyTailing {
// already tailing this file
continue
}
seek, err := t.getSeekInfo(file)
if err != nil {
return err
}
tailer, err := tail.TailFile(file,
tail.Config{
ReOpen: true,
Follow: true,
Location: seek,
MustExist: true,
Poll: poll,
Pipe: t.Pipe,
Logger: tail.DiscardingLogger,
OpenReaderFunc: func(rd io.Reader) io.Reader {
r, _ := utfbom.Skip(t.decoder.Reader(rd))
return r
},
})
if err != nil {
t.Log.Debugf("Failed to open file (%s): %v", file, err)
continue
}
t.Log.Debugf("Tail added for %q", file)
parser, err := t.parserFunc()
if err != nil {
t.Log.Errorf("Creating parser: %s", err.Error())
continue
}
// create a goroutine for each "tailer"
t.wg.Add(1)
// Store the tailer in the map before starting the goroutine
t.tailersMutex.Lock()
t.tailers[tailer.Filename] = tailer
t.tailersMutex.Unlock()
go func(tl *tail.Tail) {
defer t.wg.Done()
t.receiver(parser, tl)
t.Log.Debugf("Tail removed for %q", tl.Filename)
if err := tl.Err(); err != nil {
if strings.HasSuffix(err.Error(), "permission denied") {
t.Log.Errorf("Deleting tailer for %q due to: %v", tl.Filename, err)
t.tailersMutex.Lock()
delete(t.tailers, tl.Filename)
t.tailersMutex.Unlock()
} else {
t.Log.Errorf("Tailing %q: %s", tl.Filename, err.Error())
}
}
}(tailer)
}
}
// Clean up tailers for files that are no longer being monitored
return t.cleanupUnusedTailers(currentFiles)
}
// cleanupUnusedTailers stops and removes tailers for files that are no longer being monitored.
// It uses defer to ensure the mutex is always unlocked, even if errors occur.
func (t *Tail) cleanupUnusedTailers(currentFiles map[string]bool) error {
t.tailersMutex.Lock()
defer t.tailersMutex.Unlock()
for file, tailer := range t.tailers {
if !currentFiles[file] {
// This file is no longer in our glob pattern matches
// We need to stop tailing it and remove it from our list
t.Log.Debugf("Removing tailer for %q as it's no longer in the glob pattern", file)
// Save the current offset for potential future use
if !t.Pipe {
offset, err := tailer.Tell()
if err == nil {
t.Log.Debugf("Recording offset %d for %q", offset, tailer.Filename)
t.offsets[tailer.Filename] = offset
} else {
t.Log.Errorf("Recording offset for %q: %s", tailer.Filename, err.Error())
}
}
// Stop the tailer
err := tailer.Stop()
if err != nil {
t.Log.Errorf("Stopping tail on %q: %s", tailer.Filename, err.Error())
}
// Remove from our map
delete(t.tailers, file)
}
}
return nil
}
func parseLine(parser telegraf.Parser, line string) ([]telegraf.Metric, error) {
m, err := parser.Parse([]byte(line))
if err != nil {
if errors.Is(err, parsers.ErrEOF) {
return nil, nil
}
return nil, err
}
	return m, nil
}
// receiver is launched as a goroutine to continuously watch a tailed logfile
// for changes, parse any incoming messages, and add to the accumulator.
func (t *Tail) receiver(parser telegraf.Parser, tailer *tail.Tail) {
// holds the individual lines of multi-line log entries.
var buffer bytes.Buffer
var timer *time.Timer
var timeout <-chan time.Time
// The multiline mode requires a timer in order to flush the multiline buffer
// if no new lines are incoming.
if t.multiline.isEnabled() {
timer = time.NewTimer(time.Duration(*t.MultilineConfig.Timeout))
timeout = timer.C
}
channelOpen := true
tailerOpen := true
var line *tail.Line
for {
line = nil
if timer != nil {
timer.Reset(time.Duration(*t.MultilineConfig.Timeout))
}
select {
case <-t.ctx.Done():
channelOpen = false
case line, tailerOpen = <-tailer.Lines:
if !tailerOpen {
channelOpen = false
}
case <-timeout:
}
var text string
if line != nil {
// Fix up files with Windows line endings.
text = strings.TrimRight(line.Text, "\r")
if t.multiline.isEnabled() {
if text = t.multiline.processLine(text, &buffer); text == "" {
continue
}
}
}
if line == nil || !channelOpen || !tailerOpen {
if text += flush(&buffer); text == "" {
if !channelOpen {
return
}
continue
}
}
if line != nil && line.Err != nil {
t.Log.Errorf("Tailing %q: %s", tailer.Filename, line.Err.Error())
continue
}
if t.filterColors {
out, err := ansi.Strip([]byte(text))
if err != nil {
t.Log.Errorf("Cannot strip ansi colors from %s: %s", text, err)
}
text = string(out)
}
metrics, err := parseLine(parser, text)
if err != nil {
t.Log.Errorf("Malformed log line in %q: [%q]: %s",
tailer.Filename, text, err.Error())
continue
}
if len(metrics) == 0 {
once.Do(func() {
t.Log.Debug(internal.NoMetricsCreatedMsg)
})
}
if t.PathTag != "" {
for _, metric := range metrics {
metric.AddTag(t.PathTag, tailer.Filename)
}
}
// try writing out metric first without blocking
select {
case t.sem <- empty{}:
t.acc.AddTrackingMetricGroup(metrics)
if t.ctx.Err() != nil {
return // exit!
}
continue // next loop
default:
// no room. switch to blocking write.
}
// Block until plugin is stopping or room is available to add metrics.
select {
case <-t.ctx.Done():
return
// Tail is trying to close so drain the sem to allow the receiver
// to exit. This condition is hit when the tailer may have hit the
// maximum undelivered lines and is trying to close.
case <-tailer.Dying():
<-t.sem
case t.sem <- empty{}:
t.acc.AddTrackingMetricGroup(metrics)
}
}
}
func newTail() *Tail {
offsetsMutex.Lock()
offsetsCopy := make(map[string]int64, len(offsets))
for k, v := range offsets {
offsetsCopy[k] = v
}
offsetsMutex.Unlock()
return &Tail{
MaxUndeliveredLines: 1000,
offsets: offsetsCopy,
PathTag: "path",
}
}
func init() {
inputs.Add("tail", func() telegraf.Input {
return newTail()
})
}


@@ -0,0 +1,36 @@
// Skipping plugin on Solaris due to missing fsnotify support
//
//go:build solaris
package tail
import (
_ "embed"
"github.com/influxdata/telegraf"
"github.com/influxdata/telegraf/plugins/inputs"
)
//go:embed sample.conf
var sampleConfig string
type Tail struct {
Log telegraf.Logger `toml:"-"`
}
func (*Tail) SampleConfig() string {
return sampleConfig
}
func (h *Tail) Init() error {
h.Log.Warn("Current platform is not supported")
return nil
}
func (*Tail) Gather(telegraf.Accumulator) error { return nil }
func init() {
inputs.Add("tail", func() telegraf.Input {
return &Tail{}
})
}

File diff suppressed because it is too large

Binary file not shown.

Binary file not shown.


@@ -0,0 +1,5 @@
cpu,cpu=cpu0 usage_active=11.9 1594084375000000000
cpu,cpu=cpu1 usage_active=26.0 1594084375000000000
cpu,cpu=cpu2 usage_active=14.0 1594084375000000000
cpu,cpu=cpu3 usage_active=20.4 1594084375000000000
cpu,cpu=cpu-total usage_active=18.4 1594084375000000000


@@ -0,0 +1,12 @@
1660819827410,1,some text without quotes,A
1660819827411,1,`some text all quoted`,A
1660819827412,1,`some text all quoted
but wrapped`,A
1660819827420,2,some text with `quotes`,B
1660819827430,3,some text with 'multiple "quotes" in `one` line',C
1660819827440,4,some multiline text with `quotes
spanning \`multiple\`
lines` but do not `end
directly`,D
1660819827450,5,all of `this` should `basically` work...,E


@@ -0,0 +1,12 @@
1660819827410,1,some text without quotes,A
1660819827411,1,"some text all quoted",A
1660819827412,1,"some text all quoted
but wrapped",A
1660819827420,2,some text with "quotes",B
1660819827430,3,some text with 'multiple "quotes" in `one` line',C
1660819827440,4,some multiline text with "quotes
spanning \"multiple\"
lines" but do not "end
directly",D
1660819827450,5,all of "this" should "basically" work...,E


@@ -0,0 +1,4 @@
1660819827410,1,some text without quotes,A
1660819827411,1,'some text all quoted,A
1660819827412,1,'some text all quoted
but wrapped,A


@@ -0,0 +1,2 @@
1660819827411,2,'some text all quoted,B
1660819827410,1,some text without quotes,A


@@ -0,0 +1,12 @@
1660819827410,1,some text without quotes,A
1660819827411,1,'some text all quoted',A
1660819827412,1,'some text all quoted
but wrapped',A
1660819827420,2,some text with 'quotes',B
1660819827430,3,some text with 'multiple "quotes" in `one` line',C
1660819827440,4,some multiline text with 'quotes
spanning \'multiple\'
lines' but do not 'end
directly',D
1660819827450,5,all of 'this' should 'basically' work...,E


@@ -0,0 +1,3 @@
# Test multiline
# [04/Jun/2016:12:41:45 +0100] DEBUG HelloExample: This is debug
TEST_LOG_MULTILINE \[%{HTTPDATE:timestamp:ts-httpd}\] %{WORD:loglevel:tag} %{GREEDYDATA:message}


@@ -0,0 +1,7 @@
[04/Jun/2016:12:41:45 +0100] DEBUG HelloExample: This is debug
[04/Jun/2016:12:41:48 +0100] INFO HelloExample: This is info
[04/Jun/2016:12:41:46 +0100] ERROR HelloExample: Sorry, something wrong!
java.lang.ArithmeticException: / by zero
at com.foo.HelloExample2.divide(HelloExample2.java:24)
at com.foo.HelloExample2.main(HelloExample2.java:14)
[04/Jun/2016:12:41:48 +0100] WARN HelloExample: This is warn