Adding upstream version 0.5.0.
Signed-off-by: Daniel Baumann <daniel@debian.org>
parent f9051e9424
commit 16e40566d2
8 changed files with 1303 additions and 0 deletions
29
LICENSE.md
Normal file
@@ -0,0 +1,29 @@
BSD 3-Clause License

Copyright (c) 2022-present, Gani Georgiev
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this
   list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice,
   this list of conditions and the following disclaimer in the documentation
   and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its
   contributors may be used to endorse or promote products derived from
   this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
118
README.md
Normal file
@@ -0,0 +1,118 @@
fexpr
[Go Report Card](https://goreportcard.com/report/github.com/ganigeorgiev/fexpr)
[Go Reference](https://pkg.go.dev/github.com/ganigeorgiev/fexpr)
================================================================================

**fexpr** is a filter query language parser that generates an easy-to-work-with AST structure so that you can safely create SQL, Elasticsearch, etc. queries from user input.

In other words, it transforms the string `"id > 1"` into the struct `[{{{<nil> identifier id} > {<nil> number 1}} &&}]`.

Supports parentheses and various conditional expression operators (see [Grammar](https://github.com/ganigeorgiev/fexpr#grammar)).


## Example usage

```
go get github.com/ganigeorgiev/fexpr
```

```go
package main

import (
    "fmt"
    "log"

    "github.com/ganigeorgiev/fexpr"
)

func main() {
    result, err := fexpr.Parse("id=123 && status='active'")
    if err != nil {
        log.Fatal(err)
    }

    // [{{{<nil> identifier id} = {<nil> number 123}} &&} {{{<nil> identifier status} = {<nil> text active}} &&}]
    fmt.Println(result)
}
```

> Note that each parsed expression statement contains a join/union operator (`&&` or `||`) so that the result can be consumed in small chunks without having to rely on the group/nesting context.

> See the [package documentation](https://pkg.go.dev/github.com/ganigeorgiev/fexpr) for more details and examples.
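
As a rough illustration of consuming the result in small chunks, the sketch below walks a parsed result and renders it as a SQL-like string. To stay self-contained it declares local stand-ins for the package's `Token`, `Expr` and `ExprGroup` types (simplified, with `Token.Meta` omitted), and `render` and `operand` are hypothetical helpers that are not part of fexpr. It performs no escaping, so real code would bind values as query parameters instead.

```go
package main

import (
	"fmt"
	"strings"
)

// Local stand-ins mirroring the fexpr types (assumption: simplified copies
// so that the sketch runs without the dependency).
type Token struct {
	Type    string
	Literal string
}

type Expr struct {
	Left  Token
	Op    string
	Right Token
}

type ExprGroup struct {
	Item interface{} // either Expr or []ExprGroup
	Join string      // "&&" or "||"
}

// render is a hypothetical helper that turns the groups into a SQL-like
// string. The join operator of each group is written before the group
// itself, except for the first one.
func render(groups []ExprGroup) string {
	var sb strings.Builder
	for i, g := range groups {
		if i > 0 {
			if g.Join == "||" {
				sb.WriteString(" OR ")
			} else {
				sb.WriteString(" AND ")
			}
		}
		switch item := g.Item.(type) {
		case Expr:
			fmt.Fprintf(&sb, "%s %s %s", operand(item.Left), item.Op, operand(item.Right))
		case []ExprGroup:
			sb.WriteString("(" + render(item) + ")")
		}
	}
	return sb.String()
}

// operand quotes text tokens and leaves the other token types untouched.
func operand(t Token) string {
	if t.Type == "text" {
		return "'" + t.Literal + "'"
	}
	return t.Literal
}

func main() {
	// Hand-built equivalent of parsing: (a=1 && 2!=3) || "b"=a
	groups := []ExprGroup{
		{Join: "&&", Item: []ExprGroup{
			{Join: "&&", Item: Expr{Token{"identifier", "a"}, "=", Token{"number", "1"}}},
			{Join: "&&", Item: Expr{Token{"number", "2"}, "!=", Token{"number", "3"}}},
		}},
		{Join: "||", Item: Expr{Token{"text", "b"}, "=", Token{"identifier", "a"}}},
	}
	fmt.Println(render(groups)) // (a = 1 AND 2 != 3) OR 'b' = a
}
```

Because every group carries its own join operator, the loop never needs to look ahead or track nesting state; recursion handles nested groups.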


## Grammar

**fexpr** grammar resembles the SQL `WHERE` expression syntax. It recognizes several token types (identifiers, numbers, quoted text, expression operators, whitespaces, etc.).

> You can find all supported tokens in [`scanner.go`](https://github.com/ganigeorgiev/fexpr/blob/master/scanner.go).

#### Operators

- **`=`** Equal operator (e.g. `a=b`)
- **`!=`** NOT Equal operator (e.g. `a!=b`)
- **`>`** Greater than operator (e.g. `a>b`)
- **`>=`** Greater than or equal operator (e.g. `a>=b`)
- **`<`** Less than operator (e.g. `a<b`)
- **`<=`** Less than or equal operator (e.g. `a<=b`)
- **`~`** Like/Contains operator (e.g. `a~b`)
- **`!~`** NOT Like/Contains operator (e.g. `a!~b`)
- **`?=`** Array/Any equal operator (e.g. `a?=b`)
- **`?!=`** Array/Any NOT Equal operator (e.g. `a?!=b`)
- **`?>`** Array/Any Greater than operator (e.g. `a?>b`)
- **`?>=`** Array/Any Greater than or equal operator (e.g. `a?>=b`)
- **`?<`** Array/Any Less than operator (e.g. `a?<b`)
- **`?<=`** Array/Any Less than or equal operator (e.g. `a?<=b`)
- **`?~`** Array/Any Like/Contains operator (e.g. `a?~b`)
- **`?!~`** Array/Any NOT Like/Contains operator (e.g. `a?!~b`)
- **`&&`** AND join operator (e.g. `a=b && c=d`)
- **`||`** OR join operator (e.g. `a=b || c=d`)
- **`()`** Parentheses (e.g. `(a=1 && b=2) || (a=3 && b=4)`)
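
Every comparison operator in the list has a `?`-prefixed array/any variant, so a consumer can treat the prefix as a flag on top of the eight base operators. The sketch below shows one possible way to do that; `splitAny` and `baseOps` are hypothetical names used for illustration and are not part of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// baseOps lists the comparison operators from the grammar above.
var baseOps = map[string]bool{
	"=": true, "!=": true, ">": true, ">=": true,
	"<": true, "<=": true, "~": true, "!~": true,
}

// splitAny is a hypothetical helper: it returns the base operator, whether
// the "?" (array/any) prefix was present, and whether the base is known.
func splitAny(op string) (base string, isAny bool, ok bool) {
	base = strings.TrimPrefix(op, "?")
	isAny = base != op
	return base, isAny, baseOps[base]
}

func main() {
	for _, op := range []string{"=", "?>=", "?!~", "=="} {
		base, isAny, ok := splitAny(op)
		fmt.Printf("%q -> base=%q any=%v valid=%v\n", op, base, isAny, ok)
	}
}
```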

#### Numbers

Number tokens are any integer or decimal numbers.

_Example_: `123`, `10.50`, `-14`.
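
A minimal sketch of that rule as a whole-string check, loosely following the conditions in `scanNumber` from this commit (digits, an optional leading `-`, at most one `.`, and no leading or trailing dot). The function name `isNumberLiteral` is hypothetical; the real scanner consumes runes from a stream rather than validating a complete string.

```go
package main

import "fmt"

// isNumberLiteral is a hypothetical whole-string check for the number format
// described above.
func isNumberLiteral(s string) bool {
	if s == "" || s == "-" {
		return false
	}
	if s[0] == '.' || s[len(s)-1] == '.' {
		return false
	}
	hadDot := false
	for i := 0; i < len(s); i++ {
		ch := s[i]
		switch {
		case ch >= '0' && ch <= '9':
			// digits are always allowed
		case ch == '-' && i == 0:
			// minus sign is allowed only at the beginning
		case ch == '.' && !hadDot:
			hadDot = true
		default:
			return false
		}
	}
	return true
}

func main() {
	for _, s := range []string{"123", "10.50", "-14", "1.2.3", "5.", "-"} {
		fmt.Printf("%q valid=%v\n", s, isNumberLiteral(s))
	}
}
```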

#### Quoted text

Text tokens are any literals that are wrapped by `'` or `"` quotes.

_Example_: `'Lorem ipsum dolor 123!'`, `"escaped \"word\""`, `"mixed 'quotes' are fine"`.
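
The unquoting behaviour can be sketched as follows. Only the quote character that wraps the literal is unescaped, which matches the parser test expectations in this commit (for example `'te\'st'` becomes `te'st`, while `"te\'st"` keeps its backslash). The helper name `unquote` is hypothetical and it works on a complete string, unlike the streaming `scanText`.

```go
package main

import (
	"fmt"
	"strings"
)

// unquote is a hypothetical helper: it strips the surrounding quotes and
// unescapes only the quote character that wraps the literal.
func unquote(s string) (string, error) {
	if len(s) < 2 || (s[0] != '\'' && s[0] != '"') {
		return "", fmt.Errorf("invalid quoted text %q", s)
	}
	quote := s[0]

	// Find the first unescaped matching quote; it must be the last byte.
	end := -1
	for i := 1; i < len(s); i++ {
		if s[i] == quote && s[i-1] != '\\' {
			end = i
			break
		}
	}
	if end != len(s)-1 {
		return "", fmt.Errorf("invalid quoted text %q", s)
	}

	q := string(quote)
	return strings.ReplaceAll(s[1:end], `\`+q, q), nil
}

func main() {
	for _, s := range []string{`'Lorem ipsum dolor 123!'`, `"escaped \"word\""`, `"mixed 'quotes' are fine"`, `"demo'`} {
		text, err := unquote(s)
		fmt.Printf("%s -> %q (err: %v)\n", s, text, err)
	}
}
```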

#### Identifiers

Identifier tokens are literals that start with a letter, `_`, `@` or `#` and may be followed by any number of letters, digits, `.` (usually used as a separator) or `:` (usually used as a modifier) characters.

_Example_: `id`, `a.b.c`, `field123`, `@request.method`, `author.name:length`.
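
The description above can be approximated with a regular expression. The pattern below is an assumption derived only from this README text (plus `_` as a continuation character, which the scanner loop accepts); it is not the validation used by the package, which may be stricter, for instance about trailing separators.

```go
package main

import (
	"fmt"
	"regexp"
)

// identifierRe approximates the identifier rule described above.
// Assumption: ASCII letters only; the package's own check may differ.
var identifierRe = regexp.MustCompile(`^[A-Za-z_@#][A-Za-z0-9_.:]*$`)

// looksLikeIdentifier is a hypothetical helper built on the pattern above.
func looksLikeIdentifier(s string) bool {
	return identifierRe.MatchString(s)
}

func main() {
	for _, s := range []string{"id", "a.b.c", "field123", "@request.method", "author.name:length", "1abc"} {
		fmt.Printf("%q -> %v\n", s, looksLikeIdentifier(s))
	}
}
```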

#### Functions

Function tokens are similar to the identifiers but in addition accept a list of arguments enclosed in parentheses `()`.
The function arguments must be separated by commas (_a single trailing comma is also allowed_) and each argument can be an identifier, quoted text, number or another nested function (_up to 2 nested_).

_Example_: `test()`, `test(a.b, 123, "abc")`, `@a.b.c:test(true)`, `a(b(c(1, 2)))`.

#### Comments

Comment tokens are any single line text literals starting with `//`.
Similar to whitespaces, comments are ignored by `fexpr.Parse()`.

_Example_: `// test`.
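
For illustration, the comment rule can be sketched as a small standalone function that mirrors what `scanComment` in this commit does: require the two leading slashes, read up to the end of the line, and trim the surrounding whitespace. The name `parseComment` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// parseComment is a hypothetical helper: it returns the trimmed comment text
// and true when s starts with "//", reading only up to the first newline.
func parseComment(s string) (string, bool) {
	if !strings.HasPrefix(s, "//") {
		return "", false
	}
	text := s[2:]
	if i := strings.IndexByte(text, '\n'); i >= 0 {
		text = text[:i]
	}
	return strings.TrimSpace(text), true
}

func main() {
	text, ok := parseComment("// test")
	fmt.Printf("%q %v\n", text, ok) // "test" true
}
```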


## Using only the scanner

The tokenizer (`fexpr.Scanner`) can be used without the parser's state machine, so you can write your own custom token processing:

```go
s := fexpr.NewScanner([]byte("id > 123"))

// scan single token at a time until EOF or error is reached
for {
    t, err := s.Scan()
    if t.Type == fexpr.TokenEOF || err != nil {
        break
    }

    fmt.Println(t)
}

// Output:
// {<nil> identifier id}
// {<nil> whitespace  }
// {<nil> sign >}
// {<nil> whitespace  }
// {<nil> number 123}
```
36
examples_test.go
Normal file
@@ -0,0 +1,36 @@
package fexpr_test

import (
	"fmt"

	"github.com/ganigeorgiev/fexpr"
)

func ExampleScanner_Scan() {
	s := fexpr.NewScanner([]byte("id > 123"))

	for {
		t, err := s.Scan()
		if t.Type == fexpr.TokenEOF || err != nil {
			break
		}

		fmt.Println(t)
	}

	// Output:
	// {<nil> identifier id}
	// {<nil> whitespace  }
	// {<nil> sign >}
	// {<nil> whitespace  }
	// {<nil> number 123}
}

func ExampleParse() {
	result, _ := fexpr.Parse("id > 123")

	fmt.Println(result)

	// Output:
	// [{{{<nil> identifier id} > {<nil> number 123}} &&}]
}
3
go.mod
Normal file
@@ -0,0 +1,3 @@
module github.com/ganigeorgiev/fexpr

go 1.16
130
parser.go
Normal file
@@ -0,0 +1,130 @@
package fexpr

import (
	"errors"
	"fmt"
)

var ErrEmpty = errors.New("empty filter expression")
var ErrIncomplete = errors.New("invalid or incomplete filter expression")
var ErrInvalidComment = errors.New("invalid comment")

// Expr represents an individual tokenized expression consisting
// of left operand, operator and a right operand.
type Expr struct {
	Left  Token
	Op    SignOp
	Right Token
}

// IsZero checks if the current Expr has zero-valued props.
func (e Expr) IsZero() bool {
	return e.Op == "" && e.Left.Literal == "" && e.Left.Type == "" && e.Right.Literal == "" && e.Right.Type == ""
}

// ExprGroup represents a wrapped expression and its join type.
//
// The group's Item could be either an `Expr` instance or `[]ExprGroup` slice (for nested expressions).
type ExprGroup struct {
	Item interface{}
	Join JoinOp
}

// parser's state machine steps
const (
	stepBeforeSign = iota
	stepSign
	stepAfterSign
	StepJoin
)

// Parse parses the provided text and returns its processed AST
// in the form of `ExprGroup` slice(s).
//
// Comments and whitespaces are ignored.
func Parse(text string) ([]ExprGroup, error) {
	result := []ExprGroup{}
	scanner := NewScanner([]byte(text))
	step := stepBeforeSign
	join := JoinAnd

	var expr Expr

	for {
		t, err := scanner.Scan()
		if err != nil {
			return nil, err
		}

		if t.Type == TokenEOF {
			break
		}

		if t.Type == TokenWS || t.Type == TokenComment {
			continue
		}

		if t.Type == TokenGroup {
			groupResult, err := Parse(t.Literal)
			if err != nil {
				return nil, err
			}

			// append only if non-empty group
			if len(groupResult) > 0 {
				result = append(result, ExprGroup{Join: join, Item: groupResult})
			}

			step = StepJoin
			continue
		}

		switch step {
		case stepBeforeSign:
			if t.Type != TokenIdentifier && t.Type != TokenText && t.Type != TokenNumber && t.Type != TokenFunction {
				return nil, fmt.Errorf("expected left operand (identifier, function, text or number), got %q (%s)", t.Literal, t.Type)
			}

			expr = Expr{Left: t}

			step = stepSign
		case stepSign:
			if t.Type != TokenSign {
				return nil, fmt.Errorf("expected a sign operator, got %q (%s)", t.Literal, t.Type)
			}

			expr.Op = SignOp(t.Literal)
			step = stepAfterSign
		case stepAfterSign:
			if t.Type != TokenIdentifier && t.Type != TokenText && t.Type != TokenNumber && t.Type != TokenFunction {
				return nil, fmt.Errorf("expected right operand (identifier, function, text or number), got %q (%s)", t.Literal, t.Type)
			}

			expr.Right = t
			result = append(result, ExprGroup{Join: join, Item: expr})

			step = StepJoin
		case StepJoin:
			if t.Type != TokenJoin {
				return nil, fmt.Errorf("expected && or ||, got %q (%s)", t.Literal, t.Type)
			}

			join = JoinAnd
			if t.Literal == "||" {
				join = JoinOr
			}

			step = stepBeforeSign
		}
	}

	if step != StepJoin {
		if len(result) == 0 && expr.IsZero() {
			return nil, ErrEmpty
		}

		return nil, ErrIncomplete
	}

	return result, nil
}
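
The four-step state machine in `Parse` can be seen in isolation in the sketch below. It is a simplified, hypothetical reduction that only validates the order of already classified tokens; the `tok` type, the `check` function and the three token kinds are assumptions made for illustration, and groups, whitespace, comments and the empty-input case are left out.

```go
package main

import (
	"errors"
	"fmt"
)

// tok is a hypothetical, pre-classified token.
type tok struct {
	kind string // "operand", "sign" or "join"
	lit  string
}

const (
	stepBeforeSign = iota
	stepSign
	stepAfterSign
	stepJoin
)

// check walks the tokens through the same step order as Parse:
// left operand -> sign -> right operand -> join -> left operand ...
func check(tokens []tok) error {
	step := stepBeforeSign
	for _, t := range tokens {
		switch step {
		case stepBeforeSign:
			if t.kind != "operand" {
				return fmt.Errorf("expected left operand, got %q", t.lit)
			}
			step = stepSign
		case stepSign:
			if t.kind != "sign" {
				return fmt.Errorf("expected a sign operator, got %q", t.lit)
			}
			step = stepAfterSign
		case stepAfterSign:
			if t.kind != "operand" {
				return fmt.Errorf("expected right operand, got %q", t.lit)
			}
			step = stepJoin
		case stepJoin:
			if t.kind != "join" {
				return fmt.Errorf("expected && or ||, got %q", t.lit)
			}
			step = stepBeforeSign
		}
	}
	// A valid input must end right after a complete expression.
	if step != stepJoin {
		return errors.New("invalid or incomplete filter expression")
	}
	return nil
}

func main() {
	ok := []tok{{"operand", "a"}, {"sign", "="}, {"operand", "1"}}
	bad := []tok{{"operand", "a"}, {"sign", "="}, {"operand", "1"}, {"join", "&&"}}
	fmt.Println(check(ok))  // <nil>
	fmt.Println(check(bad)) // invalid or incomplete filter expression
}
```

Ending in any step other than the join step means the input stopped in the middle of an expression, which is why a trailing `&&` is rejected.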
142
parser_test.go
Normal file
@@ -0,0 +1,142 @@
package fexpr

import (
	"fmt"
	"testing"
)

func TestExprIsZero(t *testing.T) {
	scenarios := []struct {
		expr   Expr
		result bool
	}{
		{Expr{}, true},
		{Expr{Op: SignAnyEq}, false},
		{Expr{Left: Token{Literal: "123"}}, false},
		{Expr{Left: Token{Type: TokenWS}}, false},
		{Expr{Right: Token{Literal: "123"}}, false},
		{Expr{Right: Token{Type: TokenWS}}, false},
	}

	for i, s := range scenarios {
		t.Run(fmt.Sprintf("s%d", i), func(t *testing.T) {
			if v := s.expr.IsZero(); v != s.result {
				t.Fatalf("Expected %v, got %v for \n%v", s.result, v, s.expr)
			}
		})
	}
}

func TestParse(t *testing.T) {
	scenarios := []struct {
		input         string
		expectedError bool
		expectedPrint string
	}{
		{`> 1`, true, "[]"},
		{`a >`, true, "[]"},
		{`a > >`, true, "[]"},
		{`a > %`, true, "[]"},
		{`a ! 1`, true, "[]"},
		{`a - 1`, true, "[]"},
		{`a + 1`, true, "[]"},
		{`1 - 1`, true, "[]"},
		{`1 + 1`, true, "[]"},
		{`> a 1`, true, "[]"},
		{`a || 1`, true, "[]"},
		{`a && 1`, true, "[]"},
		{`test > 1 &&`, true, `[]`},
		{`|| test = 1`, true, `[]`},
		{`test = 1 && ||`, true, "[]"},
		{`test = 1 && a`, true, "[]"},
		{`test = 1 && a`, true, "[]"},
		{`test = 1 && "a"`, true, "[]"},
		{`test = 1 a`, true, "[]"},
		{`test = 1 a`, true, "[]"},
		{`test = 1 "a"`, true, "[]"},
		{`test = 1@test`, true, "[]"},
		{`test = .@test`, true, "[]"},
		// mismatched text quotes
		{`test = "demo'`, true, "[]"},
		{`test = 'demo"`, true, "[]"},
		{`test = 'demo'"`, true, "[]"},
		{`test = 'demo''`, true, "[]"},
		{`test = "demo"'`, true, "[]"},
		{`test = "demo""`, true, "[]"},
		{`test = ""demo""`, true, "[]"},
		{`test = ''demo''`, true, "[]"},
		{"test = `demo`", true, "[]"},
		// comments
		{"test = / demo", true, "[]"},
		{"test = // demo", true, "[]"},
		{"// demo", true, "[]"},
		{"test = 123 // demo", false, "[{{{<nil> identifier test} = {<nil> number 123}} &&}]"},
		{"test = // demo\n123", false, "[{{{<nil> identifier test} = {<nil> number 123}} &&}]"},
		{`
			a = 123 &&
			// demo
			b = 456
		`, false, "[{{{<nil> identifier a} = {<nil> number 123}} &&} {{{<nil> identifier b} = {<nil> number 456}} &&}]"},
		// functions
		{`test() = 12`, false, `[{{{[] function test} = {<nil> number 12}} &&}]`},
		{`(a.b.c(1) = d.e.f(2)) || 1=2`, false, `[{[{{{[{<nil> number 1}] function a.b.c} = {[{<nil> number 2}] function d.e.f}} &&}] &&} {{{<nil> number 1} = {<nil> number 2}} ||}]`},
		// valid simple expression and sign operators check
		{`1=12`, false, `[{{{<nil> number 1} = {<nil> number 12}} &&}]`},
		{` 1 = 12 `, false, `[{{{<nil> number 1} = {<nil> number 12}} &&}]`},
		{`"demo" != test`, false, `[{{{<nil> text demo} != {<nil> identifier test}} &&}]`},
		{`a~1`, false, `[{{{<nil> identifier a} ~ {<nil> number 1}} &&}]`},
		{`a !~ 1`, false, `[{{{<nil> identifier a} !~ {<nil> number 1}} &&}]`},
		{`test>12`, false, `[{{{<nil> identifier test} > {<nil> number 12}} &&}]`},
		{`test > 12`, false, `[{{{<nil> identifier test} > {<nil> number 12}} &&}]`},
		{`test >="test"`, false, `[{{{<nil> identifier test} >= {<nil> text test}} &&}]`},
		{`test<@demo.test2`, false, `[{{{<nil> identifier test} < {<nil> identifier @demo.test2}} &&}]`},
		{`1<="test"`, false, `[{{{<nil> number 1} <= {<nil> text test}} &&}]`},
		{`1<="te'st"`, false, `[{{{<nil> number 1} <= {<nil> text te'st}} &&}]`},
		{`demo='te\'st'`, false, `[{{{<nil> identifier demo} = {<nil> text te'st}} &&}]`},
		{`demo="te\'st"`, false, `[{{{<nil> identifier demo} = {<nil> text te\'st}} &&}]`},
		{`demo="te\"st"`, false, `[{{{<nil> identifier demo} = {<nil> text te"st}} &&}]`},
		// invalid parenthesis
		{`(a=1`, true, `[]`},
		{`a=1)`, true, `[]`},
		{`((a=1)`, true, `[]`},
		{`{a=1}`, true, `[]`},
		{`[a=1]`, true, `[]`},
		{`((a=1 || a=2) && c=1))`, true, `[]`},
		// valid parenthesis
		{`()`, true, `[]`},
		{`(a=1)`, false, `[{[{{{<nil> identifier a} = {<nil> number 1}} &&}] &&}]`},
		{`(a="test(")`, false, `[{[{{{<nil> identifier a} = {<nil> text test(}} &&}] &&}]`},
		{`(a="test)")`, false, `[{[{{{<nil> identifier a} = {<nil> text test)}} &&}] &&}]`},
		{`((a=1))`, false, `[{[{[{{{<nil> identifier a} = {<nil> number 1}} &&}] &&}] &&}]`},
		{`a=1 || 2!=3`, false, `[{{{<nil> identifier a} = {<nil> number 1}} &&} {{{<nil> number 2} != {<nil> number 3}} ||}]`},
		{`a=1 && 2!=3`, false, `[{{{<nil> identifier a} = {<nil> number 1}} &&} {{{<nil> number 2} != {<nil> number 3}} &&}]`},
		{`a=1 && 2!=3 || "b"=a`, false, `[{{{<nil> identifier a} = {<nil> number 1}} &&} {{{<nil> number 2} != {<nil> number 3}} &&} {{{<nil> text b} = {<nil> identifier a}} ||}]`},
		{`(a=1 && 2!=3) || "b"=a`, false, `[{[{{{<nil> identifier a} = {<nil> number 1}} &&} {{{<nil> number 2} != {<nil> number 3}} &&}] &&} {{{<nil> text b} = {<nil> identifier a}} ||}]`},
		{`((a=1 || a=2) && (c=1))`, false, `[{[{[{{{<nil> identifier a} = {<nil> number 1}} &&} {{{<nil> identifier a} = {<nil> number 2}} ||}] &&} {[{{{<nil> identifier c} = {<nil> number 1}} &&}] &&}] &&}]`},
		// https://github.com/pocketbase/pocketbase/issues/5017
		{`(a='"')`, false, `[{[{{{<nil> identifier a} = {<nil> text "}} &&}] &&}]`},
		{`(a='\'')`, false, `[{[{{{<nil> identifier a} = {<nil> text '}} &&}] &&}]`},
		{`(a="'")`, false, `[{[{{{<nil> identifier a} = {<nil> text '}} &&}] &&}]`},
		{`(a="\"")`, false, `[{[{{{<nil> identifier a} = {<nil> text "}} &&}] &&}]`},
	}

	for i, scenario := range scenarios {
		t.Run(fmt.Sprintf("s%d:%s", i, scenario.input), func(t *testing.T) {
			v, err := Parse(scenario.input)

			if scenario.expectedError && err == nil {
				t.Fatalf("Expected error, got nil (%q)", scenario.input)
			}

			if !scenario.expectedError && err != nil {
				t.Fatalf("Did not expect error, got %q (%q).", err, scenario.input)
			}

			vPrint := fmt.Sprintf("%v", v)

			if vPrint != scenario.expectedPrint {
				t.Fatalf("Expected %s, got %s", scenario.expectedPrint, vPrint)
			}
		})
	}
}
679
scanner.go
Normal file
@@ -0,0 +1,679 @@
package fexpr

import (
	"bytes"
	"fmt"
	"strings"
	"unicode/utf8"
)

// eof represents a marker rune for the end of the reader.
const eof = rune(0)

// JoinOp represents a join type operator.
type JoinOp string

// supported join type operators
const (
	JoinAnd JoinOp = "&&"
	JoinOr  JoinOp = "||"
)

// SignOp represents an expression sign operator.
type SignOp string

// supported expression sign operators
const (
	SignEq    SignOp = "="
	SignNeq   SignOp = "!="
	SignLike  SignOp = "~"
	SignNlike SignOp = "!~"
	SignLt    SignOp = "<"
	SignLte   SignOp = "<="
	SignGt    SignOp = ">"
	SignGte   SignOp = ">="

	// array/any operators
	SignAnyEq    SignOp = "?="
	SignAnyNeq   SignOp = "?!="
	SignAnyLike  SignOp = "?~"
	SignAnyNlike SignOp = "?!~"
	SignAnyLt    SignOp = "?<"
	SignAnyLte   SignOp = "?<="
	SignAnyGt    SignOp = "?>"
	SignAnyGte   SignOp = "?>="
)

// TokenType represents a Token type.
type TokenType string

// token type constants
const (
	TokenUnexpected TokenType = "unexpected"
	TokenEOF        TokenType = "eof"
	TokenWS         TokenType = "whitespace"
	TokenJoin       TokenType = "join"
	TokenSign       TokenType = "sign"
	TokenIdentifier TokenType = "identifier" // variable, column name, placeholder, etc.
	TokenFunction   TokenType = "function"   // function
	TokenNumber     TokenType = "number"
	TokenText       TokenType = "text"  // ' or " quoted string
	TokenGroup      TokenType = "group" // grouped/nested tokens
	TokenComment    TokenType = "comment"
)

// Token represents a single scanned literal (one or more combined runes).
type Token struct {
	Meta    interface{}
	Type    TokenType
	Literal string
}

// NewScanner creates and returns a new scanner instance loaded with the specified data.
func NewScanner(data []byte) *Scanner {
	return &Scanner{
		data:         data,
		maxFuncDepth: 3,
	}
}

// Scanner represents a filter and lexical scanner.
type Scanner struct {
	data         []byte
	pos          int
	maxFuncDepth int
}

// Scan reads and returns the next available token value from the scanner's buffer.
func (s *Scanner) Scan() (Token, error) {
	ch := s.read()

	if ch == eof {
		return Token{Type: TokenEOF, Literal: ""}, nil
	}

	if isWhitespaceRune(ch) {
		s.unread()
		return s.scanWhitespace()
	}

	if isGroupStartRune(ch) {
		s.unread()
		return s.scanGroup()
	}

	if isIdentifierStartRune(ch) {
		s.unread()
		return s.scanIdentifier(s.maxFuncDepth)
	}

	if isNumberStartRune(ch) {
		s.unread()
		return s.scanNumber()
	}

	if isTextStartRune(ch) {
		s.unread()
		return s.scanText(false)
	}

	if isSignStartRune(ch) {
		s.unread()
		return s.scanSign()
	}

	if isJoinStartRune(ch) {
		s.unread()
		return s.scanJoin()
	}

	if isCommentStartRune(ch) {
		s.unread()
		return s.scanComment()
	}

	return Token{Type: TokenUnexpected, Literal: string(ch)}, fmt.Errorf("unexpected character %q", ch)
}

// scanWhitespace consumes all contiguous whitespace runes.
func (s *Scanner) scanWhitespace() (Token, error) {
	var buf bytes.Buffer

	// Reads every subsequent whitespace character into the buffer.
	// Non-whitespace runes and EOF will cause the loop to exit.
	for {
		ch := s.read()

		if ch == eof {
			break
		}

		if !isWhitespaceRune(ch) {
			s.unread()
			break
		}

		// write the whitespace rune
		buf.WriteRune(ch)
	}

	return Token{Type: TokenWS, Literal: buf.String()}, nil
}

// scanNumber consumes all contiguous digit runes
// (complex numbers and scientific notations are not supported).
func (s *Scanner) scanNumber() (Token, error) {
	var buf bytes.Buffer

	var hadDot bool

	// Read every subsequent digit rune into the buffer.
	// Non-digit runes and EOF will cause the loop to exit.
	for {
		ch := s.read()

		if ch == eof {
			break
		}

		// not a digit rune
		if !isDigitRune(ch) &&
			// minus sign but not at the beginning
			(ch != '-' || buf.Len() != 0) &&
			// dot but there was already another dot
			(ch != '.' || hadDot) {
			s.unread()
			break
		}

		// write the rune
		buf.WriteRune(ch)

		if ch == '.' {
			hadDot = true
		}
	}

	total := buf.Len()
	literal := buf.String()

	var err error
	// only "-" or starts with "." or ends with "."
	if (total == 1 && literal[0] == '-') || literal[0] == '.' || literal[total-1] == '.' {
		err = fmt.Errorf("invalid number %q", literal)
	}

	return Token{Type: TokenNumber, Literal: buf.String()}, err
}

// scanText consumes all contiguous quoted text runes.
func (s *Scanner) scanText(preserveQuotes bool) (Token, error) {
	var buf bytes.Buffer

	// read the first rune to determine the quotes type
	firstCh := s.read()
	buf.WriteRune(firstCh)
	var prevCh rune
	var hasMatchingQuotes bool

	// Read every subsequent text rune into the buffer.
	// EOF and matching unescaped ending quote will cause the loop to exit.
	for {
		ch := s.read()

		if ch == eof {
			break
		}

		// write the text rune
		buf.WriteRune(ch)

		// unescaped matching quote, aka. the end
		if ch == firstCh && prevCh != '\\' {
			hasMatchingQuotes = true
			break
		}

		prevCh = ch
	}

	literal := buf.String()

	var err error
	if !hasMatchingQuotes {
		err = fmt.Errorf("invalid quoted text %q", literal)
	} else if !preserveQuotes {
		// unquote
		literal = literal[1 : len(literal)-1]
		// remove escaped quotes prefix (aka. \)
		firstChStr := string(firstCh)
		literal = strings.ReplaceAll(literal, `\`+firstChStr, firstChStr)
	}

	return Token{Type: TokenText, Literal: literal}, err
}

// scanComment consumes all contiguous single line comment runes until
// a new line character (\n) or EOF is reached.
func (s *Scanner) scanComment() (Token, error) {
	var buf bytes.Buffer

	// Read the first 2 characters without writing them to the buffer.
	if !isCommentStartRune(s.read()) || !isCommentStartRune(s.read()) {
		return Token{Type: TokenComment}, ErrInvalidComment
	}

	// Read every subsequent comment text rune into the buffer.
	// \n and EOF will cause the loop to exit.
	for i := 0; ; i++ {
		ch := s.read()

		if ch == eof || ch == '\n' {
			break
		}

		buf.WriteRune(ch)
	}

	return Token{Type: TokenComment, Literal: strings.TrimSpace(buf.String())}, nil
}

// scanIdentifier consumes all contiguous ident runes.
func (s *Scanner) scanIdentifier(funcDepth int) (Token, error) {
	var buf bytes.Buffer

	// read the first rune in case it is a special start identifier character
	buf.WriteRune(s.read())

	// Read every subsequent identifier rune into the buffer.
	// Non-ident runes and EOF will cause the loop to exit.
	for {
		ch := s.read()

		if ch == eof {
			break
		}

		// func
		if ch == '(' {
			funcName := buf.String()
			if funcDepth <= 0 {
				return Token{Type: TokenFunction, Literal: funcName}, fmt.Errorf("max nested function arguments reached (max: %d)", s.maxFuncDepth)
			}
			if !isValidIdentifier(funcName) {
				return Token{Type: TokenFunction, Literal: funcName}, fmt.Errorf("invalid function name %q", funcName)
			}
			s.unread()
			return s.scanFunctionArgs(funcName, funcDepth)
		}

		// not an identifier character
		if !isLetterRune(ch) && !isDigitRune(ch) && !isIdentifierCombineRune(ch) && ch != '_' {
			s.unread()
			break
		}

		// write the identifier rune
		buf.WriteRune(ch)
	}

	literal := buf.String()

	var err error
	if !isValidIdentifier(literal) {
		err = fmt.Errorf("invalid identifier %q", literal)
	}

	return Token{Type: TokenIdentifier, Literal: literal}, err
}

// scanSign consumes all contiguous sign operator runes.
func (s *Scanner) scanSign() (Token, error) {
	var buf bytes.Buffer

	// Read every subsequent sign rune into the buffer.
	// Non-sign runes and EOF will cause the loop to exit.
	for {
		ch := s.read()

		if ch == eof {
			break
		}

		if !isSignStartRune(ch) {
			s.unread()
			break
		}

		// write the sign rune
		buf.WriteRune(ch)
	}

	literal := buf.String()

	var err error
	if !isSignOperator(literal) {
		err = fmt.Errorf("invalid sign operator %q", literal)
	}

	return Token{Type: TokenSign, Literal: literal}, err
}

// scanJoin consumes all contiguous join operator runes.
|
||||||
|
func (s *Scanner) scanJoin() (Token, error) {
|
||||||
|
var buf bytes.Buffer
|
||||||
|
|
||||||
|
// Read every subsequent join operator rune into the buffer.
|
||||||
|
// Non-join runes and EOF will cause the loop to exit.
|
||||||
|
for {
|
||||||
|
ch := s.read()
|
||||||
|
|
||||||
|
if ch == eof {
|
||||||
|
break
|
||||||
|
}
|
||||||
|
|
||||||
|
if !isJoinStartRune(ch) {
|
||||||
|
s.unread()
|
||||||
|
break
|
||||||
|
}
|
||||||
|
|
||||||
|
// write the join operator rune
|
||||||
|
buf.WriteRune(ch)
|
||||||
|
}
|
||||||
|
|
||||||
|
literal := buf.String()
|
||||||
|
|
||||||
|
var err error
|
||||||
|
if !isJoinOperator(literal) {
|
||||||
|
err = fmt.Errorf("invalid join operator %q", literal)
|
||||||
|
}
|
||||||
|
|
||||||
|
return Token{Type: TokenJoin, Literal: literal}, err
|
||||||
|
}

// scanGroup consumes all runes within a group/parenthesis.
func (s *Scanner) scanGroup() (Token, error) {
	var buf bytes.Buffer

	// read the first group bracket without writing it to the buffer
	firstChar := s.read()
	openGroups := 1

	// Read every subsequent rune into the buffer.
	// EOF and the closing bracket of the main group will cause the loop to exit.
	for {
		ch := s.read()

		if ch == eof {
			break
		}

		if isGroupStartRune(ch) {
			// nested group
			openGroups++
			buf.WriteRune(ch)
		} else if isTextStartRune(ch) {
			s.unread()
			t, err := s.scanText(true) // with quotes to preserve the exact text start/end runes
			if err != nil {
				// write the errored literal as it is
				buf.WriteString(t.Literal)
				return Token{Type: TokenGroup, Literal: buf.String()}, err
			}

			buf.WriteString(t.Literal)
		} else if ch == ')' {
			openGroups--

			if openGroups <= 0 {
				// main group end
				break
			} else {
				buf.WriteRune(ch)
			}
		} else {
			buf.WriteRune(ch)
		}
	}

	literal := buf.String()

	var err error
	if !isGroupStartRune(firstChar) || openGroups > 0 {
		err = fmt.Errorf("invalid formatted group - missing %d closing bracket(s)", openGroups)
	}

	return Token{Type: TokenGroup, Literal: literal}, err
}

// scanFunctionArgs consumes all contiguous function call runes to
// extract its arguments and returns a function token with the found
// Token arguments loaded in Token.Meta.
func (s *Scanner) scanFunctionArgs(funcName string, funcDepth int) (Token, error) {
	var args []Token

	var expectComma, isComma, isClosed bool

	ch := s.read()
	if ch != '(' {
		return Token{Type: TokenFunction, Literal: funcName}, fmt.Errorf("invalid or incomplete function call %q", funcName)
	}

	// Read every subsequent rune until ')' or EOF has been reached.
	for {
		ch := s.read()

		if ch == eof {
			break
		}

		if ch == ')' {
			isClosed = true
			break
		}

		// skip whitespaces
		if isWhitespaceRune(ch) {
			_, err := s.scanWhitespace()
			if err != nil {
				return Token{Type: TokenFunction, Literal: funcName, Meta: args}, fmt.Errorf("failed to scan whitespaces in function %q: %w", funcName, err)
			}
			continue
		}

		// skip comments
		if isCommentStartRune(ch) {
			s.unread()
			_, err := s.scanComment()
			if err != nil {
				return Token{Type: TokenFunction, Literal: funcName, Meta: args}, fmt.Errorf("failed to scan comment in function %q: %w", funcName, err)
			}
			continue
		}

		isComma = ch == ','

		if expectComma && !isComma {
			return Token{Type: TokenFunction, Literal: funcName, Meta: args}, fmt.Errorf("expected comma after the last argument in function %q", funcName)
		}

		if !expectComma && isComma {
			return Token{Type: TokenFunction, Literal: funcName, Meta: args}, fmt.Errorf("unexpected comma in function %q", funcName)
		}

		expectComma = false // reset

		if isComma {
			continue
		}

		if isIdentifierStartRune(ch) {
			s.unread()
			t, err := s.scanIdentifier(funcDepth - 1)
			if err != nil {
				return Token{Type: TokenFunction, Literal: funcName, Meta: args}, fmt.Errorf("invalid identifier argument %q in function %q: %w", t.Literal, funcName, err)
			}
			args = append(args, t)
			expectComma = true
		} else if isNumberStartRune(ch) {
			s.unread()
			t, err := s.scanNumber()
			if err != nil {
				return Token{Type: TokenFunction, Literal: funcName, Meta: args}, fmt.Errorf("invalid number argument %q in function %q: %w", t.Literal, funcName, err)
			}
			args = append(args, t)
			expectComma = true
		} else if isTextStartRune(ch) {
			s.unread()
			t, err := s.scanText(false)
			if err != nil {
				return Token{Type: TokenFunction, Literal: funcName, Meta: args}, fmt.Errorf("invalid text argument %q in function %q: %w", t.Literal, funcName, err)
			}
			args = append(args, t)
			expectComma = true
		} else {
			return Token{Type: TokenFunction, Literal: funcName, Meta: args}, fmt.Errorf("unsupported argument character %q in function %q", ch, funcName)
		}
	}

	if !isClosed {
		return Token{Type: TokenFunction, Literal: funcName, Meta: args}, fmt.Errorf("invalid or incomplete function %q (expected ')')", funcName)
	}

	return Token{Type: TokenFunction, Literal: funcName, Meta: args}, nil
}

// unread moves the position one byte back.
//
// Note that read advances by the width of the decoded rune, so unread
// fully undoes a read only when the last rune was a single byte wide.
func (s *Scanner) unread() {
	if s.pos > 0 {
		s.pos = s.pos - 1
	}
}

// read reads the next rune and moves the position forward.
func (s *Scanner) read() rune {
	if s.pos >= len(s.data) {
		return eof
	}

	ch, n := utf8.DecodeRune(s.data[s.pos:])
	s.pos += n

	return ch
}

// Lexical helpers:
// -------------------------------------------------------------------

// isWhitespaceRune checks if a rune is a space, tab, or newline.
func isWhitespaceRune(ch rune) bool { return ch == ' ' || ch == '\t' || ch == '\n' }

// isLetterRune checks if a rune is an ASCII letter (a-z, A-Z).
func isLetterRune(ch rune) bool {
	return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z')
}

// isDigitRune checks if a rune is an ASCII digit (0-9).
func isDigitRune(ch rune) bool {
	return (ch >= '0' && ch <= '9')
}

// isTextStartRune checks if a rune is a valid quoted text first character
// (aka. single or double quote).
func isTextStartRune(ch rune) bool {
	return ch == '\'' || ch == '"'
}

// isNumberStartRune checks if a rune is a valid number start character
// (aka. a digit or a minus sign).
func isNumberStartRune(ch rune) bool {
	return ch == '-' || isDigitRune(ch)
}

// isSignStartRune checks if a rune is a valid sign operator start character.
func isSignStartRune(ch rune) bool {
	return ch == '=' ||
		ch == '?' ||
		ch == '!' ||
		ch == '>' ||
		ch == '<' ||
		ch == '~'
}

// isJoinStartRune checks if a rune is a valid join type start character.
func isJoinStartRune(ch rune) bool {
	return ch == '&' || ch == '|'
}

// isGroupStartRune checks if a rune is a valid group/parenthesis start character.
func isGroupStartRune(ch rune) bool {
	return ch == '('
}

// isCommentStartRune checks if a rune is a valid comment start character.
func isCommentStartRune(ch rune) bool {
	return ch == '/'
}

// isIdentifierStartRune checks if a rune is a valid identifier first character.
func isIdentifierStartRune(ch rune) bool {
	return isLetterRune(ch) || isIdentifierSpecialStartRune(ch)
}

// isIdentifierSpecialStartRune checks if a rune is a valid identifier first special character.
func isIdentifierSpecialStartRune(ch rune) bool {
	return ch == '@' || ch == '_' || ch == '#'
}

// isIdentifierCombineRune checks if a rune is a valid identifier combine character.
func isIdentifierCombineRune(ch rune) bool {
	return ch == '.' || ch == ':'
}

// isSignOperator checks if a literal is a valid sign operator.
func isSignOperator(literal string) bool {
	switch SignOp(literal) {
	case
		SignEq,
		SignNeq,
		SignLt,
		SignLte,
		SignGt,
		SignGte,
		SignLike,
		SignNlike,
		SignAnyEq,
		SignAnyNeq,
		SignAnyLike,
		SignAnyNlike,
		SignAnyLt,
		SignAnyLte,
		SignAnyGt,
		SignAnyGte:
		return true
	}

	return false
}

// isJoinOperator checks if a literal is a valid join type operator.
func isJoinOperator(literal string) bool {
	switch JoinOp(literal) {
	case
		JoinAnd,
		JoinOr:
		return true
	}

	return false
}

// isValidIdentifier validates the literal against common identifier requirements.
//
// It expects a non-empty literal (indexing an empty one would panic).
func isValidIdentifier(literal string) bool {
	length := len(literal)

	return (
	// doesn't end with combine rune
	!isIdentifierCombineRune(rune(literal[length-1])) &&
		// is not just a special start rune
		(length != 1 || !isIdentifierSpecialStartRune(rune(literal[0]))))
}
166
scanner_test.go
Normal file
|
@ -0,0 +1,166 @@
package fexpr

import (
	"fmt"
	"testing"
)

func TestNewScanner(t *testing.T) {
	s := NewScanner([]byte("test"))

	data := string(s.data)

	if data != "test" {
		t.Errorf("Expected the scanner reader data to be %q, got %q", "test", data)
	}
}

func TestScannerScan(t *testing.T) {
	type output struct {
		error bool
		print string
	}
	testScenarios := []struct {
		text    string
		expects []output
	}{
		// whitespace
		{" ", []output{{false, "{<nil> whitespace  }"}}},
		{"test 123", []output{{false, "{<nil> identifier test}"}, {false, "{<nil> whitespace  }"}, {false, "{<nil> number 123}"}}},
		// identifier
		{`test`, []output{{false, `{<nil> identifier test}`}}},
		{`@`, []output{{true, `{<nil> identifier @}`}}},
		{`test:`, []output{{true, `{<nil> identifier test:}`}}},
		{`test.`, []output{{true, `{<nil> identifier test.}`}}},
		{`@test.123:c`, []output{{false, `{<nil> identifier @test.123:c}`}}},
		{`_test_a.123`, []output{{false, `{<nil> identifier _test_a.123}`}}},
		{`#test.123:456`, []output{{false, `{<nil> identifier #test.123:456}`}}},
		{`.test.123`, []output{{true, `{<nil> unexpected .}`}, {false, `{<nil> identifier test.123}`}}},
		{`:test.123`, []output{{true, `{<nil> unexpected :}`}, {false, `{<nil> identifier test.123}`}}},
		{`test#@`, []output{{false, `{<nil> identifier test}`}, {true, `{<nil> identifier #}`}, {true, `{<nil> identifier @}`}}},
		{`test'`, []output{{false, `{<nil> identifier test}`}, {true, `{<nil> text '}`}}},
		{`test"d`, []output{{false, `{<nil> identifier test}`}, {true, `{<nil> text "d}`}}},
		// number
		{`123`, []output{{false, `{<nil> number 123}`}}},
		{`-123`, []output{{false, `{<nil> number -123}`}}},
		{`-123.456`, []output{{false, `{<nil> number -123.456}`}}},
		{`123.456`, []output{{false, `{<nil> number 123.456}`}}},
		{`12.34.56`, []output{{false, `{<nil> number 12.34}`}, {true, `{<nil> unexpected .}`}, {false, `{<nil> number 56}`}}},
		{`.123`, []output{{true, `{<nil> unexpected .}`}, {false, `{<nil> number 123}`}}},
		{`- 123`, []output{{true, `{<nil> number -}`}, {false, `{<nil> whitespace  }`}, {false, `{<nil> number 123}`}}},
		{`12-3`, []output{{false, `{<nil> number 12}`}, {false, `{<nil> number -3}`}}},
		{`123.abc`, []output{{true, `{<nil> number 123.}`}, {false, `{<nil> identifier abc}`}}},
		// text
		{`""`, []output{{false, `{<nil> text }`}}},
		{`''`, []output{{false, `{<nil> text }`}}},
		{`'test'`, []output{{false, `{<nil> text test}`}}},
		{`'te\'st'`, []output{{false, `{<nil> text te'st}`}}},
		{`"te\"st"`, []output{{false, `{<nil> text te"st}`}}},
		{`"tes@#,;!@#%^'\"t"`, []output{{false, `{<nil> text tes@#,;!@#%^'"t}`}}},
		{`'tes@#,;!@#%^\'"t'`, []output{{false, `{<nil> text tes@#,;!@#%^'"t}`}}},
		{`"test`, []output{{true, `{<nil> text "test}`}}},
		{`'test`, []output{{true, `{<nil> text 'test}`}}},
		{`'АБЦ`, []output{{true, `{<nil> text 'АБЦ}`}}},
		// join types
		{`&&||`, []output{{true, `{<nil> join &&||}`}}},
		{`&& ||`, []output{{false, `{<nil> join &&}`}, {false, `{<nil> whitespace  }`}, {false, `{<nil> join ||}`}}},
		{`'||test&&'&&123`, []output{{false, `{<nil> text ||test&&}`}, {false, `{<nil> join &&}`}, {false, `{<nil> number 123}`}}},
		// expression signs
		{`=!=`, []output{{true, `{<nil> sign =!=}`}}},
		{`= != ~ !~ > >= < <= ?= ?!= ?~ ?!~ ?> ?>= ?< ?<=`, []output{
			{false, `{<nil> sign =}`},
			{false, `{<nil> whitespace  }`},
			{false, `{<nil> sign !=}`},
			{false, `{<nil> whitespace  }`},
			{false, `{<nil> sign ~}`},
			{false, `{<nil> whitespace  }`},
			{false, `{<nil> sign !~}`},
			{false, `{<nil> whitespace  }`},
			{false, `{<nil> sign >}`},
			{false, `{<nil> whitespace  }`},
			{false, `{<nil> sign >=}`},
			{false, `{<nil> whitespace  }`},
			{false, `{<nil> sign <}`},
			{false, `{<nil> whitespace  }`},
			{false, `{<nil> sign <=}`},
			{false, `{<nil> whitespace  }`},
			{false, `{<nil> sign ?=}`},
			{false, `{<nil> whitespace  }`},
			{false, `{<nil> sign ?!=}`},
			{false, `{<nil> whitespace  }`},
			{false, `{<nil> sign ?~}`},
			{false, `{<nil> whitespace  }`},
			{false, `{<nil> sign ?!~}`},
			{false, `{<nil> whitespace  }`},
			{false, `{<nil> sign ?>}`},
			{false, `{<nil> whitespace  }`},
			{false, `{<nil> sign ?>=}`},
			{false, `{<nil> whitespace  }`},
			{false, `{<nil> sign ?<}`},
			{false, `{<nil> whitespace  }`},
			{false, `{<nil> sign ?<=}`},
		}},
		// comments
		{`/ test`, []output{{true, `{<nil> comment }`}, {false, `{<nil> identifier test}`}}},
		{`/ / test`, []output{{true, `{<nil> comment }`}, {true, `{<nil> comment }`}, {false, `{<nil> identifier test}`}}},
		{`//`, []output{{false, `{<nil> comment }`}}},
		{`//test`, []output{{false, `{<nil> comment test}`}}},
		{`// test`, []output{{false, `{<nil> comment test}`}}},
		{`// test1 //test2 `, []output{{false, `{<nil> comment test1 //test2}`}}},
		{`///test`, []output{{false, `{<nil> comment /test}`}}},
		// funcs
		{`test()`, []output{{false, `{[] function test}`}}},
		{`test(a, b`, []output{{true, `{[{<nil> identifier a} {<nil> identifier b}] function test}`}}},
		{`@test:abc()`, []output{{false, `{[] function @test:abc}`}}},
		{`test( a )`, []output{{false, `{[{<nil> identifier a}] function test}`}}}, // with whitespaces
		{`test(a, b)`, []output{{false, `{[{<nil> identifier a} {<nil> identifier b}] function test}`}}},
		{`test(a, b, )`, []output{{false, `{[{<nil> identifier a} {<nil> identifier b}] function test}`}}}, // single trailing comma
		{`test(a,,)`, []output{{true, `{[{<nil> identifier a}] function test}`}, {true, `{<nil> unexpected )}`}}}, // unexpected trailing commas
		{`test(a,,,b)`, []output{{true, `{[{<nil> identifier a}] function test}`}, {true, `{<nil> unexpected ,}`}, {false, `{<nil> identifier b}`}, {true, `{<nil> unexpected )}`}}}, // unexpected mid-args commas
		{`test( @test.a.b:test , 123, "ab)c", 'd,ce', false)`, []output{{false, `{[{<nil> identifier @test.a.b:test} {<nil> number 123} {<nil> text ab)c} {<nil> text d,ce} {<nil> identifier false}] function test}`}}},
		{"test(a //test)", []output{{true, `{[{<nil> identifier a}] function test}`}}}, // invalid simple comment
		{"test(a //test\n)", []output{{false, `{[{<nil> identifier a}] function test}`}}}, // valid simple comment
		{"test(a, //test\n, b)", []output{{true, `{[{<nil> identifier a}] function test}`}, {false, `{<nil> whitespace  }`}, {false, `{<nil> identifier b}`}, {true, `{<nil> unexpected )}`}}},
		{"test(a, //test\n b)", []output{{false, `{[{<nil> identifier a} {<nil> identifier b}] function test}`}}},
		{"test(a, test(test(b), c), d)", []output{{false, `{[{<nil> identifier a} {[{[{<nil> identifier b}] function test} {<nil> identifier c}] function test} {<nil> identifier d}] function test}`}}},
		// max funcs depth
		{"a(b(c(1)))", []output{{false, `{[{[{[{<nil> number 1}] function c}] function b}] function a}`}}},
		{"a(b(c(d(1))))", []output{{true, `{[] function a}`}, {false, `{<nil> number 1}`}, {true, `{<nil> unexpected )}`}, {true, `{<nil> unexpected )}`}, {true, `{<nil> unexpected )}`}, {true, `{<nil> unexpected )}`}}},
		// groups/parenthesis
		{`a)`, []output{{false, `{<nil> identifier a}`}, {true, `{<nil> unexpected )}`}}},
		{`(a b c`, []output{{true, `{<nil> group a b c}`}}},
		{`(a b c)`, []output{{false, `{<nil> group a b c}`}}},
		{`((a b c))`, []output{{false, `{<nil> group (a b c)}`}}},
		{`((a )b c))`, []output{{false, `{<nil> group (a )b c}`}, {true, `{<nil> unexpected )}`}}},
		{`("ab)("c)`, []output{{false, `{<nil> group "ab)("c}`}}},
		{`("ab)(c)`, []output{{true, `{<nil> group "ab)(c)}`}}},
		{`( func(1, 2, 3, func(4)) a b c )`, []output{{false, `{<nil> group  func(1, 2, 3, func(4)) a b c }`}}},
	}

	for _, scenario := range testScenarios {
		t.Run(scenario.text, func(t *testing.T) {
			s := NewScanner([]byte(scenario.text))

			// scan the text tokens
			for j, expect := range scenario.expects {
				token, err := s.Scan()

				hasErr := err != nil
				if expect.error != hasErr {
					t.Errorf("[%d] Expected hasErr %v, got %v: %v (%v)", j, expect.error, hasErr, err, token)
				}

				tokenPrint := fmt.Sprintf("%v", token)
				if tokenPrint != expect.print {
					t.Errorf("[%d] Expected token %s, got %s", j, expect.print, tokenPrint)
				}
			}

			// the last remaining token should be the eof
			lastToken, err := s.Scan()
			if err != nil || lastToken.Type != TokenEOF {
				t.Fatalf("Expected EOF token, got %v (%v)", lastToken, err)
			}
		})
	}
}
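
Note on the `=!=` and `&&||` scenarios: they error as a single token because `scanSign` and `scanJoin` first consume every contiguous operator rune and only then validate the collected literal. The following is a minimal standalone sketch of that consume-then-validate approach, not part of this commit. The names `isOpRune`, `validOps` and `scanOp` are hypothetical, and the operator table is copied from the sign scenario in the test above rather than from the package's `SignOp` constants.

```go
package main

import "fmt"

// isOpRune reports whether ch belongs to the sign operator rune set
// (the same runes accepted by isSignStartRune in scanner.go).
func isOpRune(ch rune) bool {
	switch ch {
	case '=', '?', '!', '>', '<', '~':
		return true
	}
	return false
}

// validOps is a hypothetical lookup table listing the operators
// exercised by the sign test scenario.
var validOps = map[string]bool{
	"=": true, "!=": true, "~": true, "!~": true,
	">": true, ">=": true, "<": true, "<=": true,
	"?=": true, "?!=": true, "?~": true, "?!~": true,
	"?>": true, "?>=": true, "?<": true, "?<=": true,
}

// scanOp consumes the longest leading run of operator runes and
// validates the whole run afterwards, instead of stopping at the
// first prefix that happens to be a valid operator.
func scanOp(input string) (literal, rest string, ok bool) {
	end := len(input)
	for i, ch := range input {
		if !isOpRune(ch) {
			end = i
			break
		}
	}
	literal = input[:end]
	return literal, input[end:], validOps[literal]
}

func main() {
	for _, in := range []string{">=5", "?!~'a'", "=!=1"} {
		lit, rest, ok := scanOp(in)
		fmt.Printf("%q -> literal=%q rest=%q valid=%v\n", in, lit, rest, ok)
	}
}
```

With this approach `=!=1` yields the invalid literal `=!=` followed by `1`, rather than a valid `=` followed by `!=`, which matches the error expectation in the test table.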