
Merging upstream version 25.7.1.

Signed-off-by: Daniel Baumann <daniel@debian.org>
Daniel Baumann 2025-02-13 21:51:42 +01:00
parent dba379232c
commit aa0eae236a
Signed by: daniel
GPG key ID: FBB4F0E80A80222F
102 changed files with 52995 additions and 52070 deletions


@@ -1,6 +1,54 @@
Changelog
=========
## [v25.7.0] - 2024-07-25
### :sparkles: New Features
- [`ba0aa50`](https://github.com/tobymao/sqlglot/commit/ba0aa50072f623c299eb4d2dbb69993541fff27b) - **duckdb**: Transpile BQ's exp.DatetimeAdd, exp.DatetimeSub *(PR [#3777](https://github.com/tobymao/sqlglot/pull/3777) by [@VaggelisD](https://github.com/VaggelisD))* (see the sketch after this list)
- [`5da91fb`](https://github.com/tobymao/sqlglot/commit/5da91fb50d0f8029ddda16040ebd316c1a651e2d) - **postgres**: Support for CREATE INDEX CONCURRENTLY *(PR [#3787](https://github.com/tobymao/sqlglot/pull/3787) by [@VaggelisD](https://github.com/VaggelisD))*
- :arrow_lower_right: *addresses issue [#3783](https://github.com/tobymao/sqlglot/issues/3783) opened by [@EdgyEdgemond](https://github.com/EdgyEdgemond)*
- [`00722eb`](https://github.com/tobymao/sqlglot/commit/00722eb41795e7454d0ecb4c3d0e1caf96a19465) - Move ANNOTATORS to Dialect for dialect-aware annotation *(PR [#3786](https://github.com/tobymao/sqlglot/pull/3786) by [@VaggelisD](https://github.com/VaggelisD))*
- :arrow_lower_right: *addresses issue [#3778](https://github.com/tobymao/sqlglot/issues/3778) opened by [@ddelzell](https://github.com/ddelzell)*
- [`a6d84fb`](https://github.com/tobymao/sqlglot/commit/a6d84fbd9b4120f42b31bb01d4bf3e6258e51562) - **postgres**: Parse TO_DATE as exp.StrToDate *(PR [#3799](https://github.com/tobymao/sqlglot/pull/3799) by [@VaggelisD](https://github.com/VaggelisD))*
- :arrow_lower_right: *addresses issue [#3797](https://github.com/tobymao/sqlglot/issues/3797) opened by [@dioptre](https://github.com/dioptre)*
- [`3582644`](https://github.com/tobymao/sqlglot/commit/358264478e5449b7e4ebddce1cc463d140f266f5) - **hive, spark, db**: Support for exp.GenerateSeries *(PR [#3798](https://github.com/tobymao/sqlglot/pull/3798) by [@VaggelisD](https://github.com/VaggelisD))*
- :arrow_lower_right: *addresses issue [#3793](https://github.com/tobymao/sqlglot/issues/3793) opened by [@wojciechowski-p](https://github.com/wojciechowski-p)*
- [`80b4a12`](https://github.com/tobymao/sqlglot/commit/80b4a12b779b661e42d31cf75ead8aff25257f8a) - **tsql**: Support for COLUMNSTORE option on CREATE INDEX *(PR [#3805](https://github.com/tobymao/sqlglot/pull/3805) by [@VaggelisD](https://github.com/VaggelisD))*
- :arrow_lower_right: *addresses issue [#3801](https://github.com/tobymao/sqlglot/issues/3801) opened by [@na399](https://github.com/na399)*
- [`bf6c126`](https://github.com/tobymao/sqlglot/commit/bf6c12687f3ed032ea7be40875c19fc00e5927ad) - **databricks**: Support USE CATALOG *(PR [#3812](https://github.com/tobymao/sqlglot/pull/3812) by [@VaggelisD](https://github.com/VaggelisD))*
- :arrow_lower_right: *addresses issue [#3811](https://github.com/tobymao/sqlglot/issues/3811) opened by [@grusin-db](https://github.com/grusin-db)*
- [`624d411`](https://github.com/tobymao/sqlglot/commit/624d4115e3ee4b8db2dbf2970bf0047e14b23e92) - **snowflake**: Support for OBJECT_INSERT, transpile to DDB *(PR [#3807](https://github.com/tobymao/sqlglot/pull/3807) by [@VaggelisD](https://github.com/VaggelisD))*
- :arrow_lower_right: *addresses issue [#3802](https://github.com/tobymao/sqlglot/issues/3802) opened by [@buremba](https://github.com/buremba)*
- [`5b393fb`](https://github.com/tobymao/sqlglot/commit/5b393fb4d2db47b9229ca12a03aba82cdd510615) - **postgres**: Add missing constraint options *(PR [#3816](https://github.com/tobymao/sqlglot/pull/3816) by [@VaggelisD](https://github.com/VaggelisD))*
- :arrow_lower_right: *addresses issue [#3814](https://github.com/tobymao/sqlglot/issues/3814) opened by [@DTovstohan](https://github.com/DTovstohan)*
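As a quick, hedged illustration of the first feature in this list (BigQuery's `DATETIME_ADD`/`DATETIME_SUB` transpiled to DuckDB interval arithmetic), using made-up table and column names; the exact output text can vary between sqlglot versions:

```python
import sqlglot

# BigQuery datetime arithmetic rewritten for DuckDB ("events"/"ts" are hypothetical names).
print(sqlglot.transpile(
    "SELECT DATETIME_ADD(ts, INTERVAL 1 HOUR) FROM events",
    read="bigquery",
    write="duckdb",
)[0])
# Expected to come out roughly as: SELECT ts + INTERVAL '1' HOUR FROM events
```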
### :bug: Bug Fixes
- [`898f523`](https://github.com/tobymao/sqlglot/commit/898f523a8db9f73b59055f1e38cf4acb07157f00) - **duckdb**: Wrap JSON_EXTRACT if it's subscripted *(PR [#3785](https://github.com/tobymao/sqlglot/pull/3785) by [@VaggelisD](https://github.com/VaggelisD))*
- :arrow_lower_right: *fixes issue [#3782](https://github.com/tobymao/sqlglot/issues/3782) opened by [@egan8888](https://github.com/egan8888)*
- [`db3748d`](https://github.com/tobymao/sqlglot/commit/db3748d56b138a6427d6f4fc3e32c895ffb993fa) - **mysql**: don't wrap VALUES clause *(PR [#3792](https://github.com/tobymao/sqlglot/pull/3792) by [@georgesittas](https://github.com/georgesittas))*
- :arrow_lower_right: *fixes issue [#3789](https://github.com/tobymao/sqlglot/issues/3789) opened by [@stephenprater](https://github.com/stephenprater)*
- [`44d6506`](https://github.com/tobymao/sqlglot/commit/44d650637d5d7a662b57ec1d8ca74dffe0f7ad73) - with as comments closes [#3794](https://github.com/tobymao/sqlglot/pull/3794) *(commit by [@tobymao](https://github.com/tobymao))*
- [`8ca6a61`](https://github.com/tobymao/sqlglot/commit/8ca6a613692e7339717c449ba6966d7c2911b584) - **tsql**: Fix roundtrip of exp.Stddev *(PR [#3806](https://github.com/tobymao/sqlglot/pull/3806) by [@VaggelisD](https://github.com/VaggelisD))*
- :arrow_lower_right: *fixes issue [#3804](https://github.com/tobymao/sqlglot/issues/3804) opened by [@JonaGeishauser](https://github.com/JonaGeishauser)*
- [`8551063`](https://github.com/tobymao/sqlglot/commit/855106377c97ee313b45046041fafabb2810dab2) - **duckdb**: Fix STRUCT_PACK -> ROW due to is_struct_cast *(PR [#3809](https://github.com/tobymao/sqlglot/pull/3809) by [@VaggelisD](https://github.com/VaggelisD))*
- :arrow_lower_right: *fixes issue [#3808](https://github.com/tobymao/sqlglot/issues/3808) opened by [@aersam](https://github.com/aersam)*
- [`98f80ed`](https://github.com/tobymao/sqlglot/commit/98f80eda3863b5ff40d566330e6ab35a99f569ca) - **clickhouse**: allow like as an identifier closes [#3813](https://github.com/tobymao/sqlglot/pull/3813) *(commit by [@tobymao](https://github.com/tobymao))*
- [`556ba35`](https://github.com/tobymao/sqlglot/commit/556ba35e4ce9efa51561ef0578bfb24a51ce4dcd) - allow parse_identifier to handle single quotes *(commit by [@tobymao](https://github.com/tobymao))*
- [`f9810d2`](https://github.com/tobymao/sqlglot/commit/f9810d213f3992881fc13291a681da6553701083) - **snowflake**: Don't consume LPAREN when parsing staged file path *(PR [#3815](https://github.com/tobymao/sqlglot/pull/3815) by [@VaggelisD](https://github.com/VaggelisD))*
- [`416f4a1`](https://github.com/tobymao/sqlglot/commit/416f4a1b6a04b858ff8ed94509aacd9bacca145b) - **postgres**: Fix COLLATE column constraint *(PR [#3820](https://github.com/tobymao/sqlglot/pull/3820) by [@VaggelisD](https://github.com/VaggelisD))*
- :arrow_lower_right: *fixes issue [#3817](https://github.com/tobymao/sqlglot/issues/3817) opened by [@DTovstohan](https://github.com/DTovstohan)*
- [`69b9395`](https://github.com/tobymao/sqlglot/commit/69b93953c35bd7f1d53cf15d9937117edb38f512) - Do not preemptively consume SELECT [ALL] if ALL is connected *(PR [#3822](https://github.com/tobymao/sqlglot/pull/3822) by [@VaggelisD](https://github.com/VaggelisD))*
- :arrow_lower_right: *fixes issue [#3819](https://github.com/tobymao/sqlglot/issues/3819) opened by [@nfx](https://github.com/nfx)*
- [`1c19abe`](https://github.com/tobymao/sqlglot/commit/1c19abe5b3f3187a2e0ba420cf8c5e5b5ecc788e) - **presto, trino**: Fix StrToUnix transpilation *(PR [#3824](https://github.com/tobymao/sqlglot/pull/3824) by [@VaggelisD](https://github.com/VaggelisD))*
- :arrow_lower_right: *fixes issue [#3796](https://github.com/tobymao/sqlglot/issues/3796) opened by [@ddelzell](https://github.com/ddelzell)*
## [v25.6.1] - 2024-07-18
### :bug: Bug Fixes
- [`19370d5`](https://github.com/tobymao/sqlglot/commit/19370d5d16b555e25def503323ec3dc4e5d40e6c) - **postgres**: Decouple UNIQUE from DEFAULT constraints *(PR [#3775](https://github.com/tobymao/sqlglot/pull/3775) by [@VaggelisD](https://github.com/VaggelisD))*
- :arrow_lower_right: *fixes issue [#3774](https://github.com/tobymao/sqlglot/issues/3774) opened by [@EdgyEdgemond](https://github.com/EdgyEdgemond)*
- [`e99146b`](https://github.com/tobymao/sqlglot/commit/e99146b0989599772c020905f69496ea80e7e2e5) - make copy a dml statement for qualify_tables *(commit by [@tobymao](https://github.com/tobymao))*
## [v25.6.0] - 2024-07-17
### :boom: BREAKING CHANGES
- due to [`89fc63c`](https://github.com/tobymao/sqlglot/commit/89fc63c5831dc5d63feff9e39fea1e90d65e9a09) - QUALIFY comes after WINDOW clause in queries *(PR [#3745](https://github.com/tobymao/sqlglot/pull/3745) by [@georgesittas](https://github.com/georgesittas))*:
@@ -4158,3 +4206,5 @@ Changelog
[v25.5.0]: https://github.com/tobymao/sqlglot/compare/v25.4.1...v25.5.0
[v25.5.1]: https://github.com/tobymao/sqlglot/compare/v25.5.0...v25.5.1
[v25.6.0]: https://github.com/tobymao/sqlglot/compare/v25.5.1...v25.6.0
[v25.6.1]: https://github.com/tobymao/sqlglot/compare/v25.6.0...v25.6.1
[v25.7.0]: https://github.com/tobymao/sqlglot/compare/v25.6.1...v25.7.0

File diff suppressed because one or more lines are too long (repeated 71 times, once per suppressed file)


@@ -334,6 +334,11 @@ class ClickHouse(Dialect):
RESERVED_TOKENS = parser.Parser.RESERVED_TOKENS - {TokenType.SELECT}
ID_VAR_TOKENS = {
*parser.Parser.ID_VAR_TOKENS,
TokenType.LIKE,
}
AGG_FUNC_MAPPING = (
lambda functions, suffixes: {
f"{f}{sfx}": (f, sfx) for sfx in (suffixes + [""]) for f in functions


@@ -8,7 +8,7 @@ from functools import reduce
from sqlglot import exp
from sqlglot.errors import ParseError
from sqlglot.generator import Generator
-from sqlglot.helper import AutoName, flatten, is_int, seq_get
from sqlglot.helper import AutoName, flatten, is_int, seq_get, subclasses
from sqlglot.jsonpath import JSONPathTokenizer, parse as parse_json_path
from sqlglot.parser import Parser
from sqlglot.time import TIMEZONES, format_time
@@ -23,6 +23,10 @@ JSON_EXTRACT_TYPE = t.Union[exp.JSONExtract, exp.JSONExtractScalar]
if t.TYPE_CHECKING:
from sqlglot._typing import B, E, F
from sqlglot.optimizer.annotate_types import TypeAnnotator
AnnotatorsType = t.Dict[t.Type[E], t.Callable[[TypeAnnotator, E], E]]
logger = logging.getLogger("sqlglot")
UNESCAPED_SEQUENCES = {
@@ -37,6 +41,10 @@ UNESCAPED_SEQUENCES = {
}
def _annotate_with_type_lambda(data_type: exp.DataType.Type) -> t.Callable[[TypeAnnotator, E], E]:
return lambda self, e: self._annotate_with_type(e, data_type)
class Dialects(str, Enum):
"""Dialects supported by SQLGLot."""
@@ -489,6 +497,167 @@ class Dialect(metaclass=_Dialect):
"CENTURIES": "CENTURY",
}
TYPE_TO_EXPRESSIONS: t.Dict[exp.DataType.Type, t.Set[t.Type[exp.Expression]]] = {
exp.DataType.Type.BIGINT: {
exp.ApproxDistinct,
exp.ArraySize,
exp.Count,
exp.Length,
},
exp.DataType.Type.BOOLEAN: {
exp.Between,
exp.Boolean,
exp.In,
exp.RegexpLike,
},
exp.DataType.Type.DATE: {
exp.CurrentDate,
exp.Date,
exp.DateFromParts,
exp.DateStrToDate,
exp.DiToDate,
exp.StrToDate,
exp.TimeStrToDate,
exp.TsOrDsToDate,
},
exp.DataType.Type.DATETIME: {
exp.CurrentDatetime,
exp.Datetime,
exp.DatetimeAdd,
exp.DatetimeSub,
},
exp.DataType.Type.DOUBLE: {
exp.ApproxQuantile,
exp.Avg,
exp.Div,
exp.Exp,
exp.Ln,
exp.Log,
exp.Pow,
exp.Quantile,
exp.Round,
exp.SafeDivide,
exp.Sqrt,
exp.Stddev,
exp.StddevPop,
exp.StddevSamp,
exp.Variance,
exp.VariancePop,
},
exp.DataType.Type.INT: {
exp.Ceil,
exp.DatetimeDiff,
exp.DateDiff,
exp.TimestampDiff,
exp.TimeDiff,
exp.DateToDi,
exp.Levenshtein,
exp.Sign,
exp.StrPosition,
exp.TsOrDiToDi,
},
exp.DataType.Type.JSON: {
exp.ParseJSON,
},
exp.DataType.Type.TIME: {
exp.Time,
},
exp.DataType.Type.TIMESTAMP: {
exp.CurrentTime,
exp.CurrentTimestamp,
exp.StrToTime,
exp.TimeAdd,
exp.TimeStrToTime,
exp.TimeSub,
exp.TimestampAdd,
exp.TimestampSub,
exp.UnixToTime,
},
exp.DataType.Type.TINYINT: {
exp.Day,
exp.Month,
exp.Week,
exp.Year,
exp.Quarter,
},
exp.DataType.Type.VARCHAR: {
exp.ArrayConcat,
exp.Concat,
exp.ConcatWs,
exp.DateToDateStr,
exp.GroupConcat,
exp.Initcap,
exp.Lower,
exp.Substring,
exp.TimeToStr,
exp.TimeToTimeStr,
exp.Trim,
exp.TsOrDsToDateStr,
exp.UnixToStr,
exp.UnixToTimeStr,
exp.Upper,
},
}
ANNOTATORS: AnnotatorsType = {
**{
expr_type: lambda self, e: self._annotate_unary(e)
for expr_type in subclasses(exp.__name__, (exp.Unary, exp.Alias))
},
**{
expr_type: lambda self, e: self._annotate_binary(e)
for expr_type in subclasses(exp.__name__, exp.Binary)
},
**{
expr_type: _annotate_with_type_lambda(data_type)
for data_type, expressions in TYPE_TO_EXPRESSIONS.items()
for expr_type in expressions
},
exp.Abs: lambda self, e: self._annotate_by_args(e, "this"),
exp.Anonymous: lambda self, e: self._annotate_with_type(e, exp.DataType.Type.UNKNOWN),
exp.Array: lambda self, e: self._annotate_by_args(e, "expressions", array=True),
exp.ArrayAgg: lambda self, e: self._annotate_by_args(e, "this", array=True),
exp.ArrayConcat: lambda self, e: self._annotate_by_args(e, "this", "expressions"),
exp.Bracket: lambda self, e: self._annotate_bracket(e),
exp.Cast: lambda self, e: self._annotate_with_type(e, e.args["to"]),
exp.Case: lambda self, e: self._annotate_by_args(e, "default", "ifs"),
exp.Coalesce: lambda self, e: self._annotate_by_args(e, "this", "expressions"),
exp.DataType: lambda self, e: self._annotate_with_type(e, e.copy()),
exp.DateAdd: lambda self, e: self._annotate_timeunit(e),
exp.DateSub: lambda self, e: self._annotate_timeunit(e),
exp.DateTrunc: lambda self, e: self._annotate_timeunit(e),
exp.Distinct: lambda self, e: self._annotate_by_args(e, "expressions"),
exp.Div: lambda self, e: self._annotate_div(e),
exp.Dot: lambda self, e: self._annotate_dot(e),
exp.Explode: lambda self, e: self._annotate_explode(e),
exp.Extract: lambda self, e: self._annotate_extract(e),
exp.Filter: lambda self, e: self._annotate_by_args(e, "this"),
exp.GenerateDateArray: lambda self, e: self._annotate_with_type(
e, exp.DataType.build("ARRAY<DATE>")
),
exp.If: lambda self, e: self._annotate_by_args(e, "true", "false"),
exp.Interval: lambda self, e: self._annotate_with_type(e, exp.DataType.Type.INTERVAL),
exp.Least: lambda self, e: self._annotate_by_args(e, "expressions"),
exp.Literal: lambda self, e: self._annotate_literal(e),
exp.Map: lambda self, e: self._annotate_map(e),
exp.Max: lambda self, e: self._annotate_by_args(e, "this", "expressions"),
exp.Min: lambda self, e: self._annotate_by_args(e, "this", "expressions"),
exp.Null: lambda self, e: self._annotate_with_type(e, exp.DataType.Type.NULL),
exp.Nullif: lambda self, e: self._annotate_by_args(e, "this", "expression"),
exp.PropertyEQ: lambda self, e: self._annotate_by_args(e, "expression"),
exp.Slice: lambda self, e: self._annotate_with_type(e, exp.DataType.Type.UNKNOWN),
exp.Struct: lambda self, e: self._annotate_struct(e),
exp.Sum: lambda self, e: self._annotate_by_args(e, "this", "expressions", promote=True),
exp.Timestamp: lambda self, e: self._annotate_with_type(
e,
exp.DataType.Type.TIMESTAMPTZ if e.args.get("with_tz") else exp.DataType.Type.TIMESTAMP,
),
exp.ToMap: lambda self, e: self._annotate_to_map(e),
exp.TryCast: lambda self, e: self._annotate_with_type(e, e.args["to"]),
exp.Unnest: lambda self, e: self._annotate_unnest(e),
exp.VarMap: lambda self, e: self._annotate_map(e),
}
@classmethod
def get_or_raise(cls, dialect: DialectType) -> Dialect:
"""
@@ -1419,3 +1588,24 @@ def build_timestamp_from_parts(args: t.List) -> exp.Func:
def sha256_sql(self: Generator, expression: exp.SHA2) -> str:
return self.func(f"SHA{expression.text('length') or '256'}", expression.this)
def sequence_sql(self: Generator, expression: exp.GenerateSeries):
start = expression.args["start"]
end = expression.args["end"]
step = expression.args.get("step")
if isinstance(start, exp.Cast):
target_type = start.to
elif isinstance(end, exp.Cast):
target_type = end.to
else:
target_type = None
if target_type and target_type.is_type("timestamp"):
if target_type is start.to:
end = exp.cast(end, target_type)
else:
start = exp.cast(start, target_type)
return self.func("SEQUENCE", start, end, step)


@@ -3,6 +3,7 @@ from __future__ import annotations
import typing as t
from sqlglot import exp, generator, parser, tokens, transforms
from sqlglot.expressions import DATA_TYPE
from sqlglot.dialects.dialect import (
Dialect,
JSON_EXTRACT_TYPE,
@@ -35,20 +36,34 @@ from sqlglot.dialects.dialect import (
from sqlglot.helper import seq_get
from sqlglot.tokens import TokenType
DATETIME_DELTA = t.Union[
exp.DateAdd, exp.TimeAdd, exp.DatetimeAdd, exp.TsOrDsAdd, exp.DateSub, exp.DatetimeSub
]
-def _ts_or_ds_add_sql(self: DuckDB.Generator, expression: exp.TsOrDsAdd) -> str:
-this = self.sql(expression, "this")
-interval = self.sql(exp.Interval(this=expression.expression, unit=unit_to_var(expression)))
-return f"CAST({this} AS {self.sql(expression.return_type)}) + {interval}"
-def _date_delta_sql(
-self: DuckDB.Generator, expression: exp.DateAdd | exp.DateSub | exp.TimeAdd
-) -> str:
-this = self.sql(expression, "this")
def _date_delta_sql(self: DuckDB.Generator, expression: DATETIME_DELTA) -> str:
this = expression.this
unit = unit_to_var(expression)
-op = "+" if isinstance(expression, (exp.DateAdd, exp.TimeAdd)) else "-"
-return f"{this} {op} {self.sql(exp.Interval(this=expression.expression, unit=unit))}"
op = (
"+"
if isinstance(expression, (exp.DateAdd, exp.TimeAdd, exp.DatetimeAdd, exp.TsOrDsAdd))
else "-"
)
to_type: t.Optional[DATA_TYPE] = None
if isinstance(expression, exp.TsOrDsAdd):
to_type = expression.return_type
elif this.is_string:
# Cast string literals (i.e function parameters) to the appropriate type for +/- interval to work
to_type = (
exp.DataType.Type.DATETIME
if isinstance(expression, (exp.DatetimeAdd, exp.DatetimeSub))
else exp.DataType.Type.DATE
)
this = exp.cast(this, to_type) if to_type else this
return f"{self.sql(this)} {op} {self.sql(exp.Interval(this=expression.expression, unit=unit))}"
# BigQuery -> DuckDB conversion for the DATE function
@@ -119,7 +134,12 @@ def _struct_sql(self: DuckDB.Generator, expression: exp.Struct) -> str:
# BigQuery allows inline construction such as "STRUCT<a STRING, b INTEGER>('str', 1)" which is
# canonicalized to "ROW('str', 1) AS STRUCT(a TEXT, b INT)" in DuckDB
-is_struct_cast = expression.find_ancestor(exp.Cast)
# The transformation to ROW will take place if a cast to STRUCT / ARRAY of STRUCTs is found
ancestor_cast = expression.find_ancestor(exp.Cast)
is_struct_cast = ancestor_cast and any(
casted_type.is_type(exp.DataType.Type.STRUCT)
for casted_type in ancestor_cast.find_all(exp.DataType)
)
for i, expr in enumerate(expression.expressions):
is_property_eq = isinstance(expr, exp.PropertyEQ)
@@ -168,7 +188,7 @@ def _unix_to_time_sql(self: DuckDB.Generator, expression: exp.UnixToTime) -> str
def _arrow_json_extract_sql(self: DuckDB.Generator, expression: JSON_EXTRACT_TYPE) -> str:
arrow_sql = arrow_json_extract_sql(self, expression)
-if not expression.same_parent and isinstance(expression.parent, exp.Binary):
if not expression.same_parent and isinstance(expression.parent, (exp.Binary, exp.Bracket)):
arrow_sql = self.wrap(arrow_sql)
return arrow_sql
@@ -420,6 +440,8 @@ class DuckDB(Dialect):
),
exp.DateStrToDate: datestrtodate_sql,
exp.Datetime: no_datetime_sql,
exp.DatetimeSub: _date_delta_sql,
exp.DatetimeAdd: _date_delta_sql,
exp.DateToDi: lambda self,
e: f"CAST(STRFTIME({self.sql(e, 'this')}, {DuckDB.DATEINT_FORMAT}) AS INT)",
exp.Decode: lambda self, e: encode_decode_sql(self, e, "DECODE", replace=False),
@@ -484,7 +506,7 @@ class DuckDB(Dialect):
exp.TimeToUnix: rename_func("EPOCH"),
exp.TsOrDiToDi: lambda self,
e: f"CAST(SUBSTR(REPLACE(CAST({self.sql(e, 'this')} AS TEXT), '-', ''), 1, 8) AS INT)",
-exp.TsOrDsAdd: _ts_or_ds_add_sql,
exp.TsOrDsAdd: _date_delta_sql,
exp.TsOrDsDiff: lambda self, e: self.func(
"DATE_DIFF",
f"'{e.args.get('unit') or 'DAY'}'",
@@ -790,3 +812,18 @@ class DuckDB(Dialect):
)
return self.sql(case)
def objectinsert_sql(self, expression: exp.ObjectInsert) -> str:
this = expression.this
key = expression.args.get("key")
key_sql = key.name if isinstance(key, exp.Expression) else ""
value_sql = self.sql(expression, "value")
kv_sql = f"{key_sql} := {value_sql}"
# If the input struct is empty e.g. transpiling OBJECT_INSERT(OBJECT_CONSTRUCT(), key, value) from Snowflake
# then we can generate STRUCT_PACK which will build it since STRUCT_INSERT({}, key := value) is not valid DuckDB
if isinstance(this, exp.Struct) and not this.expressions:
return self.func("STRUCT_PACK", kv_sql)
return self.func("STRUCT_INSERT", this, kv_sql)


@@ -31,6 +31,7 @@ from sqlglot.dialects.dialect import (
timestrtotime_sql,
unit_to_str,
var_map_sql,
sequence_sql,
)
from sqlglot.transforms import (
remove_unique_constraints,
@@ -310,6 +311,7 @@ class Hive(Dialect):
"REGEXP_EXTRACT": lambda args: exp.RegexpExtract(
this=seq_get(args, 0), expression=seq_get(args, 1), group=seq_get(args, 2)
),
"SEQUENCE": exp.GenerateSeries.from_arg_list,
"SIZE": exp.ArraySize.from_arg_list,
"SPLIT": exp.RegexpSplit.from_arg_list,
"STR_TO_MAP": lambda args: exp.StrToMap(
@@ -506,6 +508,7 @@ class Hive(Dialect):
exp.FileFormatProperty: lambda self,
e: f"STORED AS {self.sql(e, 'this') if isinstance(e.this, exp.InputOutputFormat) else e.name.upper()}",
exp.FromBase64: rename_func("UNBASE64"),
exp.GenerateSeries: sequence_sql,
exp.If: if_sql(),
exp.ILike: no_ilike_sql,
exp.IsNan: rename_func("ISNAN"),


@@ -691,6 +691,7 @@ class MySQL(Dialect):
SUPPORTS_TO_NUMBER = False
PARSE_JSON_NAME = None
PAD_FILL_PATTERN_IS_REQUIRED = True
WRAP_DERIVED_VALUES = False
TRANSFORMS = {
**generator.Generator.TRANSFORMS,


@@ -365,6 +365,7 @@ class Postgres(Dialect):
"NOW": exp.CurrentTimestamp.from_arg_list,
"REGEXP_REPLACE": _build_regexp_replace,
"TO_CHAR": build_formatted_time(exp.TimeToStr, "postgres"),
"TO_DATE": build_formatted_time(exp.StrToDate, "postgres"),
"TO_TIMESTAMP": _build_to_timestamp,
"UNNEST": exp.Explode.from_arg_list,
"SHA256": lambda args: exp.SHA2(this=seq_get(args, 0), length=exp.Literal.number(256)),


@@ -28,6 +28,7 @@ from sqlglot.dialects.dialect import (
timestrtotime_sql,
ts_or_ds_add_cast,
unit_to_str,
sequence_sql,
)
from sqlglot.dialects.hive import Hive
from sqlglot.dialects.mysql import MySQL
@@ -204,11 +205,11 @@ def _jsonextract_sql(self: Presto.Generator, expression: exp.JSONExtract) -> str
return f"{this}{expr}"
-def _to_int(expression: exp.Expression) -> exp.Expression:
def _to_int(self: Presto.Generator, expression: exp.Expression) -> exp.Expression:
if not expression.type:
from sqlglot.optimizer.annotate_types import annotate_types
-annotate_types(expression)
annotate_types(expression, dialect=self.dialect)
if expression.type and expression.type.this not in exp.DataType.INTEGER_TYPES:
return exp.cast(expression, to=exp.DataType.Type.BIGINT)
return expression
@@ -229,7 +230,7 @@ def _date_delta_sql(
name: str, negate_interval: bool = False
) -> t.Callable[[Presto.Generator, DATE_ADD_OR_SUB], str]:
def _delta_sql(self: Presto.Generator, expression: DATE_ADD_OR_SUB) -> str:
-interval = _to_int(expression.expression)
interval = _to_int(self, expression.expression)
return self.func(
name,
unit_to_str(expression),
@@ -256,6 +257,21 @@ class Presto(Dialect):
# https://github.com/prestodb/presto/issues/2863
NORMALIZATION_STRATEGY = NormalizationStrategy.CASE_INSENSITIVE
# The result of certain math functions in Presto/Trino is of type
# equal to the input type e.g: FLOOR(5.5/2) -> DECIMAL, FLOOR(5/2) -> BIGINT
ANNOTATORS = {
**Dialect.ANNOTATORS,
exp.Floor: lambda self, e: self._annotate_by_args(e, "this"),
exp.Ceil: lambda self, e: self._annotate_by_args(e, "this"),
exp.Mod: lambda self, e: self._annotate_by_args(e, "this", "expression"),
exp.Round: lambda self, e: self._annotate_by_args(e, "this"),
exp.Sign: lambda self, e: self._annotate_by_args(e, "this"),
exp.Abs: lambda self, e: self._annotate_by_args(e, "this"),
exp.Rand: lambda self, e: self._annotate_by_args(e, "this")
if e.this
else self._set_type(e, exp.DataType.Type.DOUBLE),
}
class Tokenizer(tokens.Tokenizer):
UNICODE_STRINGS = [
(prefix + q, q)
@@ -420,6 +436,7 @@ class Presto(Dialect):
exp.FirstValue: _first_last_sql,
exp.FromTimeZone: lambda self,
e: f"WITH_TIMEZONE({self.sql(e, 'this')}, {self.sql(e, 'zone')}) AT TIME ZONE 'UTC'",
exp.GenerateSeries: sequence_sql,
exp.Group: transforms.preprocess([transforms.unalias_group]),
exp.GroupConcat: lambda self, e: self.func(
"ARRAY_JOIN", self.func("ARRAY_AGG", e.this), e.args.get("separator")
@@ -572,11 +589,20 @@ class Presto(Dialect):
# timezone involved, we wrap it in a `TRY` call and use `PARSE_DATETIME` as a fallback,
# which seems to be using the same time mapping as Hive, as per:
# https://joda-time.sourceforge.net/apidocs/org/joda/time/format/DateTimeFormat.html
-value_as_text = exp.cast(expression.this, exp.DataType.Type.TEXT)
this = expression.this
value_as_text = exp.cast(this, exp.DataType.Type.TEXT)
value_as_timestamp = (
exp.cast(this, exp.DataType.Type.TIMESTAMP) if this.is_string else this
)
parse_without_tz = self.func("DATE_PARSE", value_as_text, self.format_time(expression))
formatted_value = self.func(
"DATE_FORMAT", value_as_timestamp, self.format_time(expression)
)
parse_with_tz = self.func(
"PARSE_DATETIME",
-value_as_text,
formatted_value,
self.format_time(expression, Hive.INVERSE_TIME_MAPPING, Hive.INVERSE_TIME_TRIE),
)
coalesced = self.func("COALESCE", self.func("TRY", parse_without_tz), parse_with_tz)
@@ -636,26 +662,6 @@ class Presto(Dialect):
modes = f" {', '.join(modes)}" if modes else ""
return f"START TRANSACTION{modes}"
-def generateseries_sql(self, expression: exp.GenerateSeries) -> str:
-start = expression.args["start"]
-end = expression.args["end"]
-step = expression.args.get("step")
-if isinstance(start, exp.Cast):
-target_type = start.to
-elif isinstance(end, exp.Cast):
-target_type = end.to
-else:
-target_type = None
-if target_type and target_type.is_type("timestamp"):
-if target_type is start.to:
-end = exp.cast(end, target_type)
-else:
-start = exp.cast(start, target_type)
-return self.func("SEQUENCE", start, end, step)
def offset_limit_modifiers(
self, expression: exp.Expression, fetch: bool, limit: t.Optional[exp.Fetch | exp.Limit]
) -> t.List[str]:
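The generateseries_sql removed here now lives in dialect.py as the shared sequence_sql helper, so generation is unchanged in spirit; a hedged sketch (start and end values are arbitrary):

```python
import sqlglot

# exp.GenerateSeries rendered as Presto/Trino SEQUENCE.
print(sqlglot.transpile("SELECT GENERATE_SERIES(1, 5)", read="duckdb", write="presto")[0])
# Expected to look like: SELECT SEQUENCE(1, 5)
```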


@@ -504,43 +504,6 @@ class Snowflake(Dialect):
return lateral
def _parse_historical_data(self) -> t.Optional[exp.HistoricalData]:
# https://docs.snowflake.com/en/sql-reference/constructs/at-before
index = self._index
historical_data = None
if self._match_texts(self.HISTORICAL_DATA_PREFIX):
this = self._prev.text.upper()
kind = (
self._match(TokenType.L_PAREN)
and self._match_texts(self.HISTORICAL_DATA_KIND)
and self._prev.text.upper()
)
expression = self._match(TokenType.FARROW) and self._parse_bitwise()
if expression:
self._match_r_paren()
historical_data = self.expression(
exp.HistoricalData, this=this, kind=kind, expression=expression
)
else:
self._retreat(index)
return historical_data
def _parse_changes(self) -> t.Optional[exp.Changes]:
if not self._match_text_seq("CHANGES", "(", "INFORMATION", "=>"):
return None
information = self._parse_var(any_token=True)
self._match_r_paren()
return self.expression(
exp.Changes,
information=information,
at_before=self._parse_historical_data(),
end=self._parse_historical_data(),
)
def _parse_table_parts(
self, schema: bool = False, is_db_reference: bool = False, wildcard: bool = False
) -> exp.Table:
@@ -573,14 +536,6 @@ class Snowflake(Dialect):
else:
table = super()._parse_table_parts(schema=schema, is_db_reference=is_db_reference)
changes = self._parse_changes()
if changes:
table.set("changes", changes)
at_before = self._parse_historical_data()
if at_before:
table.set("when", at_before)
return table
def _parse_id_var(
@@ -659,7 +614,7 @@ class Snowflake(Dialect):
# can be joined in a query with a comma separator, as well as closing paren
# in case of subqueries
while self._is_connected() and not self._match_set(
-(TokenType.COMMA, TokenType.R_PAREN), advance=False
(TokenType.COMMA, TokenType.L_PAREN, TokenType.R_PAREN), advance=False
):
parts.append(self._advance_any(ignore_reserved=True))


@@ -165,9 +165,6 @@ class Spark2(Hive):
"SHUFFLE_REPLICATE_NL": lambda self: self._parse_join_hint("SHUFFLE_REPLICATE_NL"),
}
-def _parse_add_column(self) -> t.Optional[exp.Expression]:
-return self._match_text_seq("ADD", "COLUMNS") and self._parse_schema()
def _parse_drop_column(self) -> t.Optional[exp.Drop | exp.Command]:
return self._match_text_seq("DROP", "COLUMNS") and self.expression(
exp.Drop, this=self._parse_schema(), kind="COLUMNS"


@@ -855,6 +855,7 @@ class TSQL(Dialect):
transforms.eliminate_qualify,
]
),
exp.Stddev: rename_func("STDEV"),
exp.StrPosition: lambda self, e: self.func(
"CHARINDEX", e.args.get("substr"), e.this, e.args.get("position")
),
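The Stddev round-trip fix above in a nutshell (hedged sketch; the output text may differ slightly):

```python
import sqlglot

# Generic STDDEV is emitted as T-SQL STDEV...
print(sqlglot.transpile("SELECT STDDEV(x) FROM t", write="tsql")[0])
# ...and STDEV parses back to exp.Stddev, so it can be re-emitted as STDDEV elsewhere.
print(sqlglot.transpile("SELECT STDEV(x) FROM t", read="tsql", write="duckdb")[0])
```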


@@ -33,7 +33,7 @@ from sqlglot.helper import (
seq_get,
subclasses,
)
-from sqlglot.tokens import Token
from sqlglot.tokens import Token, TokenError
if t.TYPE_CHECKING:
from sqlglot._typing import E, Lit
@@ -1393,6 +1393,8 @@ class Create(DDL):
"begin": False,
"end": False,
"clone": False,
"concurrently": False,
"clustered": False,
}
@property
@@ -5483,6 +5485,16 @@ class JSONTable(Func):
}
# https://docs.snowflake.com/en/sql-reference/functions/object_insert
class ObjectInsert(Func):
arg_types = {
"this": True,
"key": True,
"value": True,
"update_flag": False,
}
class OpenJSONColumnDef(Expression):
arg_types = {"this": True, "kind": True, "path": False, "as_json": False}
@@ -5886,7 +5898,7 @@ class Sqrt(Func):
class Stddev(AggFunc):
-pass
_sql_names = ["STDDEV", "STDEV"]
class StddevPop(AggFunc):
@@ -6881,7 +6893,7 @@ def parse_identifier(name: str | Identifier, dialect: DialectType = None) -> Ide
"""
try:
expression = maybe_parse(name, dialect=dialect, into=Identifier)
-except ParseError:
except (ParseError, TokenError):
expression = to_identifier(name)
return expression


@@ -1027,6 +1027,14 @@ class Generator(metaclass=_Generator):
replace = " OR REPLACE" if expression.args.get("replace") else ""
unique = " UNIQUE" if expression.args.get("unique") else ""
clustered = expression.args.get("clustered")
if clustered is None:
clustered_sql = ""
elif clustered:
clustered_sql = " CLUSTERED COLUMNSTORE"
else:
clustered_sql = " NONCLUSTERED COLUMNSTORE"
postcreate_props_sql = "" postcreate_props_sql = ""
if properties_locs.get(exp.Properties.Location.POST_CREATE): if properties_locs.get(exp.Properties.Location.POST_CREATE):
postcreate_props_sql = self.properties( postcreate_props_sql = self.properties(
@ -1036,7 +1044,7 @@ class Generator(metaclass=_Generator):
wrapped=False, wrapped=False,
) )
modifiers = "".join((replace, unique, postcreate_props_sql)) modifiers = "".join((clustered_sql, replace, unique, postcreate_props_sql))
postexpression_props_sql = "" postexpression_props_sql = ""
if properties_locs.get(exp.Properties.Location.POST_EXPRESSION): if properties_locs.get(exp.Properties.Location.POST_EXPRESSION):
@ -1049,6 +1057,7 @@ class Generator(metaclass=_Generator):
wrapped=False, wrapped=False,
) )
concurrently = " CONCURRENTLY" if expression.args.get("concurrently") else ""
exists_sql = " IF NOT EXISTS" if expression.args.get("exists") else ""
no_schema_binding = (
" WITH NO SCHEMA BINDING" if expression.args.get("no_schema_binding") else ""
@@ -1057,7 +1066,7 @@ class Generator(metaclass=_Generator):
clone = self.sql(expression, "clone")
clone = f" {clone}" if clone else ""
-expression_sql = f"CREATE{modifiers} {kind}{exists_sql} {this}{properties_sql}{expression_sql}{postexpression_props_sql}{index_sql}{no_schema_binding}{clone}"
expression_sql = f"CREATE{modifiers} {kind}{concurrently}{exists_sql} {this}{properties_sql}{expression_sql}{postexpression_props_sql}{index_sql}{no_schema_binding}{clone}"
return self.prepend_ctes(expression, expression_sql)
def sequenceproperties_sql(self, expression: exp.SequenceProperties) -> str:
@@ -1734,8 +1743,7 @@ class Generator(metaclass=_Generator):
alias = f"{sep}{alias}" if alias else ""
hints = self.expressions(expression, key="hints", sep=" ")
hints = f" {hints}" if hints and self.TABLE_HINTS else ""
-pivots = self.expressions(expression, key="pivots", sep=" ", flat=True)
-pivots = f" {pivots}" if pivots else ""
pivots = self.expressions(expression, key="pivots", sep="", flat=True)
joins = self.indent(
self.expressions(expression, key="joins", sep="", flat=True), skip_first=True
)
@@ -1822,7 +1830,7 @@ class Generator(metaclass=_Generator):
alias = self.sql(expression, "alias")
alias = f" AS {alias}" if alias else ""
-direction = "UNPIVOT" if expression.unpivot else "PIVOT"
direction = self.seg("UNPIVOT" if expression.unpivot else "PIVOT")
field = self.sql(expression, "field")
include_nulls = expression.args.get("include_nulls")
if include_nulls is not None:
@@ -2409,10 +2417,7 @@ class Generator(metaclass=_Generator):
def subquery_sql(self, expression: exp.Subquery, sep: str = " AS ") -> str:
alias = self.sql(expression, "alias")
alias = f"{sep}{alias}" if alias else ""
-pivots = self.expressions(expression, key="pivots", sep=" ", flat=True)
-pivots = f" {pivots}" if pivots else ""
pivots = self.expressions(expression, key="pivots", sep="", flat=True)
sql = self.query_modifiers(expression, self.wrap(expression), alias, pivots)
return self.prepend_ctes(expression, sql)
@@ -3134,6 +3139,7 @@ class Generator(metaclass=_Generator):
expression,
key="actions",
prefix="ADD COLUMN ",
skip_first=True,
)
return f"ADD {self.expressions(expression, key='actions', flat=True)}"


@@ -10,10 +10,10 @@ from sqlglot.helper import (
is_iso_date,
is_iso_datetime,
seq_get,
-subclasses,
)
from sqlglot.optimizer.scope import Scope, traverse_scope
from sqlglot.schema import Schema, ensure_schema
from sqlglot.dialects.dialect import Dialect
if t.TYPE_CHECKING:
from sqlglot._typing import B, E
@@ -24,12 +24,15 @@ if t.TYPE_CHECKING:
BinaryCoercionFunc,
]
from sqlglot.dialects.dialect import DialectType, AnnotatorsType
def annotate_types(
expression: E,
schema: t.Optional[t.Dict | Schema] = None,
-annotators: t.Optional[t.Dict[t.Type[E], t.Callable[[TypeAnnotator, E], E]]] = None,
annotators: t.Optional[AnnotatorsType] = None,
coerces_to: t.Optional[t.Dict[exp.DataType.Type, t.Set[exp.DataType.Type]]] = None,
dialect: t.Optional[DialectType] = None,
) -> E:
"""
Infers the types of an expression, annotating its AST accordingly.
@@ -54,11 +57,7 @@ def annotate_types(
schema = ensure_schema(schema)
-return TypeAnnotator(schema, annotators, coerces_to).annotate(expression)
return TypeAnnotator(schema, annotators, coerces_to, dialect=dialect).annotate(expression)
-def _annotate_with_type_lambda(data_type: exp.DataType.Type) -> t.Callable[[TypeAnnotator, E], E]:
-return lambda self, e: self._annotate_with_type(e, data_type)
def _coerce_date_literal(l: exp.Expression, unit: t.Optional[exp.Expression]) -> exp.DataType.Type:
@@ -133,168 +132,6 @@ class _TypeAnnotator(type):
class TypeAnnotator(metaclass=_TypeAnnotator):
TYPE_TO_EXPRESSIONS: t.Dict[exp.DataType.Type, t.Set[t.Type[exp.Expression]]] = {
exp.DataType.Type.BIGINT: {
exp.ApproxDistinct,
exp.ArraySize,
exp.Count,
exp.Length,
},
exp.DataType.Type.BOOLEAN: {
exp.Between,
exp.Boolean,
exp.In,
exp.RegexpLike,
},
exp.DataType.Type.DATE: {
exp.CurrentDate,
exp.Date,
exp.DateFromParts,
exp.DateStrToDate,
exp.DiToDate,
exp.StrToDate,
exp.TimeStrToDate,
exp.TsOrDsToDate,
},
exp.DataType.Type.DATETIME: {
exp.CurrentDatetime,
exp.Datetime,
exp.DatetimeAdd,
exp.DatetimeSub,
},
exp.DataType.Type.DOUBLE: {
exp.ApproxQuantile,
exp.Avg,
exp.Div,
exp.Exp,
exp.Ln,
exp.Log,
exp.Pow,
exp.Quantile,
exp.Round,
exp.SafeDivide,
exp.Sqrt,
exp.Stddev,
exp.StddevPop,
exp.StddevSamp,
exp.Variance,
exp.VariancePop,
},
exp.DataType.Type.INT: {
exp.Ceil,
exp.DatetimeDiff,
exp.DateDiff,
exp.TimestampDiff,
exp.TimeDiff,
exp.DateToDi,
exp.Floor,
exp.Levenshtein,
exp.Sign,
exp.StrPosition,
exp.TsOrDiToDi,
},
exp.DataType.Type.JSON: {
exp.ParseJSON,
},
exp.DataType.Type.TIME: {
exp.Time,
},
exp.DataType.Type.TIMESTAMP: {
exp.CurrentTime,
exp.CurrentTimestamp,
exp.StrToTime,
exp.TimeAdd,
exp.TimeStrToTime,
exp.TimeSub,
exp.TimestampAdd,
exp.TimestampSub,
exp.UnixToTime,
},
exp.DataType.Type.TINYINT: {
exp.Day,
exp.Month,
exp.Week,
exp.Year,
exp.Quarter,
},
exp.DataType.Type.VARCHAR: {
exp.ArrayConcat,
exp.Concat,
exp.ConcatWs,
exp.DateToDateStr,
exp.GroupConcat,
exp.Initcap,
exp.Lower,
exp.Substring,
exp.TimeToStr,
exp.TimeToTimeStr,
exp.Trim,
exp.TsOrDsToDateStr,
exp.UnixToStr,
exp.UnixToTimeStr,
exp.Upper,
},
}
ANNOTATORS: t.Dict = {
**{
expr_type: lambda self, e: self._annotate_unary(e)
for expr_type in subclasses(exp.__name__, (exp.Unary, exp.Alias))
},
**{
expr_type: lambda self, e: self._annotate_binary(e)
for expr_type in subclasses(exp.__name__, exp.Binary)
},
**{
expr_type: _annotate_with_type_lambda(data_type)
for data_type, expressions in TYPE_TO_EXPRESSIONS.items()
for expr_type in expressions
},
exp.Abs: lambda self, e: self._annotate_by_args(e, "this"),
exp.Anonymous: lambda self, e: self._annotate_with_type(e, exp.DataType.Type.UNKNOWN),
exp.Array: lambda self, e: self._annotate_by_args(e, "expressions", array=True),
exp.ArrayAgg: lambda self, e: self._annotate_by_args(e, "this", array=True),
exp.ArrayConcat: lambda self, e: self._annotate_by_args(e, "this", "expressions"),
exp.Bracket: lambda self, e: self._annotate_bracket(e),
exp.Cast: lambda self, e: self._annotate_with_type(e, e.args["to"]),
exp.Case: lambda self, e: self._annotate_by_args(e, "default", "ifs"),
exp.Coalesce: lambda self, e: self._annotate_by_args(e, "this", "expressions"),
exp.DataType: lambda self, e: self._annotate_with_type(e, e.copy()),
exp.DateAdd: lambda self, e: self._annotate_timeunit(e),
exp.DateSub: lambda self, e: self._annotate_timeunit(e),
exp.DateTrunc: lambda self, e: self._annotate_timeunit(e),
exp.Distinct: lambda self, e: self._annotate_by_args(e, "expressions"),
exp.Div: lambda self, e: self._annotate_div(e),
exp.Dot: lambda self, e: self._annotate_dot(e),
exp.Explode: lambda self, e: self._annotate_explode(e),
exp.Extract: lambda self, e: self._annotate_extract(e),
exp.Filter: lambda self, e: self._annotate_by_args(e, "this"),
exp.GenerateDateArray: lambda self, e: self._annotate_with_type(
e, exp.DataType.build("ARRAY<DATE>")
),
exp.If: lambda self, e: self._annotate_by_args(e, "true", "false"),
exp.Interval: lambda self, e: self._annotate_with_type(e, exp.DataType.Type.INTERVAL),
exp.Least: lambda self, e: self._annotate_by_args(e, "expressions"),
exp.Literal: lambda self, e: self._annotate_literal(e),
exp.Map: lambda self, e: self._annotate_map(e),
exp.Max: lambda self, e: self._annotate_by_args(e, "this", "expressions"),
exp.Min: lambda self, e: self._annotate_by_args(e, "this", "expressions"),
exp.Null: lambda self, e: self._annotate_with_type(e, exp.DataType.Type.NULL),
exp.Nullif: lambda self, e: self._annotate_by_args(e, "this", "expression"),
exp.PropertyEQ: lambda self, e: self._annotate_by_args(e, "expression"),
exp.Slice: lambda self, e: self._annotate_with_type(e, exp.DataType.Type.UNKNOWN),
exp.Struct: lambda self, e: self._annotate_struct(e),
exp.Sum: lambda self, e: self._annotate_by_args(e, "this", "expressions", promote=True),
exp.Timestamp: lambda self, e: self._annotate_with_type(
e,
exp.DataType.Type.TIMESTAMPTZ if e.args.get("with_tz") else exp.DataType.Type.TIMESTAMP,
),
exp.ToMap: lambda self, e: self._annotate_to_map(e),
exp.TryCast: lambda self, e: self._annotate_with_type(e, e.args["to"]),
exp.Unnest: lambda self, e: self._annotate_unnest(e),
exp.VarMap: lambda self, e: self._annotate_map(e),
}
NESTED_TYPES = {
exp.DataType.Type.ARRAY,
}
@@ -335,12 +172,13 @@ class TypeAnnotator(metaclass=_TypeAnnotator):
def __init__(
self,
schema: Schema,
-annotators: t.Optional[t.Dict[t.Type[E], t.Callable[[TypeAnnotator, E], E]]] = None,
annotators: t.Optional[AnnotatorsType] = None,
coerces_to: t.Optional[t.Dict[exp.DataType.Type, t.Set[exp.DataType.Type]]] = None,
binary_coercions: t.Optional[BinaryCoercions] = None,
dialect: t.Optional[DialectType] = None,
) -> None:
self.schema = schema
-self.annotators = annotators or self.ANNOTATORS
self.annotators = annotators or Dialect.get_or_raise(dialect).ANNOTATORS
self.coerces_to = coerces_to or self.COERCES_TO
self.binary_coercions = binary_coercions or self.BINARY_COERCIONS
@@ -483,7 +321,9 @@ class TypeAnnotator(metaclass=_TypeAnnotator):
return expression
-def _annotate_with_type(self, expression: E, target_type: exp.DataType.Type) -> E:
def _annotate_with_type(
self, expression: E, target_type: exp.DataType | exp.DataType.Type
) -> E:
self._set_type(expression, target_type)
return self._annotate_args(expression)


@@ -376,6 +376,7 @@ class Parser(metaclass=_Parser):
# Tokens that can represent identifiers
ID_VAR_TOKENS = {
TokenType.ALL,
TokenType.VAR,
TokenType.ANTI,
TokenType.APPLY,
@@ -929,7 +930,8 @@ class Parser(metaclass=_Parser):
enforced=self._match_text_seq("ENFORCED"),
),
"COLLATE": lambda self: self.expression(
-exp.CollateColumnConstraint, this=self._parse_var(any_token=True)
exp.CollateColumnConstraint,
this=self._parse_identifier() or self._parse_column(),
),
"COMMENT": lambda self: self.expression(
exp.CommentColumnConstraint, this=self._parse_string()
@@ -1138,7 +1140,9 @@ class Parser(metaclass=_Parser):
ISOLATED_LOADING_OPTIONS: OPTIONS_TYPE = {"FOR": ("ALL", "INSERT", "NONE")}
-USABLES: OPTIONS_TYPE = dict.fromkeys(("ROLE", "WAREHOUSE", "DATABASE", "SCHEMA"), tuple())
USABLES: OPTIONS_TYPE = dict.fromkeys(
("ROLE", "WAREHOUSE", "DATABASE", "SCHEMA", "CATALOG"), tuple()
)
CAST_ACTIONS: OPTIONS_TYPE = dict.fromkeys(("RENAME", "ADD"), ("FIELDS",))
@@ -1147,6 +1151,17 @@ class Parser(metaclass=_Parser):
**dict.fromkeys(("BINDING", "COMPENSATION", "EVOLUTION"), tuple()),
}
KEY_CONSTRAINT_OPTIONS: OPTIONS_TYPE = {
"NOT": ("ENFORCED",),
"MATCH": (
"FULL",
"PARTIAL",
"SIMPLE",
),
"INITIALLY": ("DEFERRED", "IMMEDIATE"),
**dict.fromkeys(("DEFERRABLE", "NORELY"), tuple()),
}
INSERT_ALTERNATIVES = {"ABORT", "FAIL", "IGNORE", "REPLACE", "ROLLBACK"}
CLONE_KEYWORDS = {"CLONE", "COPY"}
@@ -1663,6 +1678,15 @@ class Parser(metaclass=_Parser):
unique = self._match(TokenType.UNIQUE)
if self._match_text_seq("CLUSTERED", "COLUMNSTORE"):
clustered = True
elif self._match_text_seq("NONCLUSTERED", "COLUMNSTORE") or self._match_text_seq(
"COLUMNSTORE"
):
clustered = False
else:
clustered = None
if self._match_pair(TokenType.TABLE, TokenType.FUNCTION, advance=False):
self._advance()
@@ -1677,6 +1701,7 @@ class Parser(metaclass=_Parser):
if not properties or not create_token:
return self._parse_as_command(start)
concurrently = self._match_text_seq("CONCURRENTLY")
exists = self._parse_exists(not_=True) exists = self._parse_exists(not_=True)
this = None this = None
expression: t.Optional[exp.Expression] = None expression: t.Optional[exp.Expression] = None
@ -1802,6 +1827,8 @@ class Parser(metaclass=_Parser):
begin=begin, begin=begin,
end=end, end=end,
clone=clone, clone=clone,
concurrently=concurrently,
clustered=clustered,
) )
def _parse_sequence_properties(self) -> t.Optional[exp.SequenceProperties]: def _parse_sequence_properties(self) -> t.Optional[exp.SequenceProperties]:
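
The new `concurrently` and `clustered` values survive as args on exp.Create. A small sketch of what that enables (the arg name follows the parser change above):

```python
import sqlglot

sql = "CREATE INDEX CONCURRENTLY ix_table_id ON tbl USING btree(id)"
create = sqlglot.parse_one(sql, read="postgres")

# CONCURRENTLY is kept on the Create expression and round-trips, per the Postgres tests below.
print(create.args.get("concurrently"))  # True
print(create.sql(dialect="postgres"))   # CREATE INDEX CONCURRENTLY ix_table_id ON tbl USING btree(id)
```
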
@ -2728,8 +2755,12 @@ class Parser(metaclass=_Parser):
comments = self._prev_comments comments = self._prev_comments
hint = self._parse_hint() hint = self._parse_hint()
if self._next and not self._next.token_type == TokenType.DOT:
all_ = self._match(TokenType.ALL) all_ = self._match(TokenType.ALL)
distinct = self._match_set(self.DISTINCT_TOKENS) distinct = self._match_set(self.DISTINCT_TOKENS)
else:
all_, distinct = None, None
kind = ( kind = (
self._match(TokenType.ALIAS) self._match(TokenType.ALIAS)
@ -2827,6 +2858,7 @@ class Parser(metaclass=_Parser):
self.raise_error("Expected CTE to have alias") self.raise_error("Expected CTE to have alias")
self._match(TokenType.ALIAS) self._match(TokenType.ALIAS)
comments = self._prev_comments
if self._match_text_seq("NOT", "MATERIALIZED"): if self._match_text_seq("NOT", "MATERIALIZED"):
materialized = False materialized = False
@ -2840,6 +2872,7 @@ class Parser(metaclass=_Parser):
this=self._parse_wrapped(self._parse_statement), this=self._parse_wrapped(self._parse_statement),
alias=alias, alias=alias,
materialized=materialized, materialized=materialized,
comments=comments,
) )
def _parse_table_alias( def _parse_table_alias(
@ -3352,15 +3385,28 @@ class Parser(metaclass=_Parser):
if not db and is_db_reference: if not db and is_db_reference:
self.raise_error(f"Expected database name but got {self._curr}") self.raise_error(f"Expected database name but got {self._curr}")
return self.expression( table = self.expression(
exp.Table, exp.Table,
comments=comments, comments=comments,
this=table, this=table,
db=db, db=db,
catalog=catalog, catalog=catalog,
pivots=self._parse_pivots(),
) )
changes = self._parse_changes()
if changes:
table.set("changes", changes)
at_before = self._parse_historical_data()
if at_before:
table.set("when", at_before)
pivots = self._parse_pivots()
if pivots:
table.set("pivots", pivots)
return table
def _parse_table( def _parse_table(
self, self,
schema: bool = False, schema: bool = False,
@ -3490,6 +3536,43 @@ class Parser(metaclass=_Parser):
return self.expression(exp.Version, this=this, expression=expression, kind=kind) return self.expression(exp.Version, this=this, expression=expression, kind=kind)
def _parse_historical_data(self) -> t.Optional[exp.HistoricalData]:
# https://docs.snowflake.com/en/sql-reference/constructs/at-before
index = self._index
historical_data = None
if self._match_texts(self.HISTORICAL_DATA_PREFIX):
this = self._prev.text.upper()
kind = (
self._match(TokenType.L_PAREN)
and self._match_texts(self.HISTORICAL_DATA_KIND)
and self._prev.text.upper()
)
expression = self._match(TokenType.FARROW) and self._parse_bitwise()
if expression:
self._match_r_paren()
historical_data = self.expression(
exp.HistoricalData, this=this, kind=kind, expression=expression
)
else:
self._retreat(index)
return historical_data
def _parse_changes(self) -> t.Optional[exp.Changes]:
if not self._match_text_seq("CHANGES", "(", "INFORMATION", "=>"):
return None
information = self._parse_var(any_token=True)
self._match_r_paren()
return self.expression(
exp.Changes,
information=information,
at_before=self._parse_historical_data(),
end=self._parse_historical_data(),
)
def _parse_unnest(self, with_alias: bool = True) -> t.Optional[exp.Unnest]: def _parse_unnest(self, with_alias: bool = True) -> t.Optional[exp.Unnest]:
if not self._match(TokenType.UNNEST): if not self._match(TokenType.UNNEST):
return None return None
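
With AT/BEFORE and CHANGES parsed onto the table expression, and pivots applied afterwards, Snowflake time-travel queries followed by UNPIVOT now round-trip. A sketch using the Snowflake test string from later in this diff:

```python
import sqlglot

sql = "SELECT * FROM table AT (TIMESTAMP => '2024-07-24') UNPIVOT(a FOR b IN (c)) AS pivot_table"
print(sqlglot.transpile(sql, read="snowflake", write="snowflake")[0] == sql)  # True
```
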
@ -5216,18 +5299,13 @@ class Parser(metaclass=_Parser):
self.raise_error("Invalid key constraint") self.raise_error("Invalid key constraint")
options.append(f"ON {on} {action}") options.append(f"ON {on} {action}")
elif self._match_text_seq("NOT", "ENFORCED"):
options.append("NOT ENFORCED")
elif self._match_text_seq("DEFERRABLE"):
options.append("DEFERRABLE")
elif self._match_text_seq("INITIALLY", "DEFERRED"):
options.append("INITIALLY DEFERRED")
elif self._match_text_seq("NORELY"):
options.append("NORELY")
elif self._match_text_seq("MATCH", "FULL"):
options.append("MATCH FULL")
else: else:
var = self._parse_var_from_options(
self.KEY_CONSTRAINT_OPTIONS, raise_unmatched=False
)
if not var:
break break
options.append(var.name)
return options return options
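
Foreign-key constraint options are now consumed through the KEY_CONSTRAINT_OPTIONS table instead of hard-coded branches, so MATCH FULL/PARTIAL/SIMPLE, NOT ENFORCED, INITIALLY DEFERRED/IMMEDIATE, DEFERRABLE and NORELY are all accepted. A trimmed-down sketch based on the Postgres test later in this diff (one constraint instead of three; the expected string is derived from that test):

```python
import sqlglot

sql = """
CREATE TABLE IF NOT EXISTS public.rental (
  inventory_id INT NOT NULL,
  CONSTRAINT rental_inventory_id_fkey FOREIGN KEY (inventory_id)
    REFERENCES public.inventory (inventory_id) MATCH PARTIAL
    ON UPDATE CASCADE ON DELETE RESTRICT
)
"""
print(sqlglot.transpile(sql, read="postgres", write="postgres")[0])
# Expected (single line): CREATE TABLE IF NOT EXISTS public.rental (inventory_id INT NOT NULL,
#   CONSTRAINT rental_inventory_id_fkey FOREIGN KEY (inventory_id) REFERENCES public.inventory (inventory_id)
#   MATCH PARTIAL ON UPDATE CASCADE ON DELETE RESTRICT)
```
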
@ -6227,6 +6305,13 @@ class Parser(metaclass=_Parser):
self._retreat(index) self._retreat(index)
if not self.ALTER_TABLE_ADD_REQUIRED_FOR_EACH_COLUMN and self._match_text_seq("ADD"): if not self.ALTER_TABLE_ADD_REQUIRED_FOR_EACH_COLUMN and self._match_text_seq("ADD"):
return self._parse_wrapped_csv(self._parse_field_def, optional=True) return self._parse_wrapped_csv(self._parse_field_def, optional=True)
if self._match_text_seq("ADD", "COLUMNS"):
schema = self._parse_schema()
if schema:
return [schema]
return []
return self._parse_wrapped_csv(self._parse_add_column, optional=True) return self._parse_wrapped_csv(self._parse_add_column, optional=True)
def _parse_alter_table_alter(self) -> t.Optional[exp.Expression]: def _parse_alter_table_alter(self) -> t.Optional[exp.Expression]:


@ -229,6 +229,23 @@ def unqualify_unnest(expression: exp.Expression) -> exp.Expression:
def unnest_to_explode(expression: exp.Expression) -> exp.Expression: def unnest_to_explode(expression: exp.Expression) -> exp.Expression:
"""Convert cross join unnest into lateral view explode.""" """Convert cross join unnest into lateral view explode."""
if isinstance(expression, exp.Select): if isinstance(expression, exp.Select):
from_ = expression.args.get("from")
if from_ and isinstance(from_.this, exp.Unnest):
unnest = from_.this
alias = unnest.args.get("alias")
udtf = exp.Posexplode if unnest.args.get("offset") else exp.Explode
this, *expressions = unnest.expressions
unnest.replace(
exp.Table(
this=udtf(
this=this,
expressions=expressions,
),
alias=exp.TableAlias(this=alias.this, columns=alias.columns) if alias else None,
)
)
for join in expression.args.get("joins") or []: for join in expression.args.get("joins") or []:
unnest = join.this unnest = join.this
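
unnest_to_explode now also rewrites UNNEST when it is the FROM source itself, not just when it appears in joins. A sketch matching the Presto test expectations later in this diff (assuming the source string matches the Presto form shown there):

```python
import sqlglot

sql = "SELECT * FROM UNNEST(ARRAY['7', '14']) AS x"
print(sqlglot.transpile(sql, read="presto", write="spark")[0])
# SELECT * FROM EXPLODE(ARRAY('7', '14')) AS x
```
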


@ -612,6 +612,7 @@ LANGUAGE js AS
write={ write={
"bigquery": "SELECT DATETIME_ADD('2023-01-01T00:00:00', INTERVAL 1 MILLISECOND)", "bigquery": "SELECT DATETIME_ADD('2023-01-01T00:00:00', INTERVAL 1 MILLISECOND)",
"databricks": "SELECT TIMESTAMPADD(MILLISECOND, 1, '2023-01-01T00:00:00')", "databricks": "SELECT TIMESTAMPADD(MILLISECOND, 1, '2023-01-01T00:00:00')",
"duckdb": "SELECT CAST('2023-01-01T00:00:00' AS DATETIME) + INTERVAL 1 MILLISECOND",
}, },
), ),
) )
@ -621,6 +622,7 @@ LANGUAGE js AS
write={ write={
"bigquery": "SELECT DATETIME_SUB('2023-01-01T00:00:00', INTERVAL 1 MILLISECOND)", "bigquery": "SELECT DATETIME_SUB('2023-01-01T00:00:00', INTERVAL 1 MILLISECOND)",
"databricks": "SELECT TIMESTAMPADD(MILLISECOND, 1 * -1, '2023-01-01T00:00:00')", "databricks": "SELECT TIMESTAMPADD(MILLISECOND, 1 * -1, '2023-01-01T00:00:00')",
"duckdb": "SELECT CAST('2023-01-01T00:00:00' AS DATETIME) - INTERVAL 1 MILLISECOND",
}, },
), ),
) )
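
Per the new duckdb expectations above, BigQuery's DATETIME_ADD and DATETIME_SUB now transpile to interval arithmetic. A sketch (the input is assumed to match the BigQuery string shown in the test):

```python
import sqlglot

sql = "SELECT DATETIME_ADD('2023-01-01T00:00:00', INTERVAL 1 MILLISECOND)"
print(sqlglot.transpile(sql, read="bigquery", write="duckdb")[0])
# SELECT CAST('2023-01-01T00:00:00' AS DATETIME) + INTERVAL 1 MILLISECOND
```
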
@ -1016,7 +1018,7 @@ LANGUAGE js AS
write={ write={
"bigquery": "SELECT * FROM UNNEST(['7', '14']) AS x", "bigquery": "SELECT * FROM UNNEST(['7', '14']) AS x",
"presto": "SELECT * FROM UNNEST(ARRAY['7', '14']) AS _t0(x)", "presto": "SELECT * FROM UNNEST(ARRAY['7', '14']) AS _t0(x)",
"spark": "SELECT * FROM UNNEST(ARRAY('7', '14')) AS _t0(x)", "spark": "SELECT * FROM EXPLODE(ARRAY('7', '14')) AS _t0(x)",
}, },
) )
self.validate_all( self.validate_all(


@ -7,6 +7,9 @@ class TestClickhouse(Validator):
dialect = "clickhouse" dialect = "clickhouse"
def test_clickhouse(self): def test_clickhouse(self):
self.validate_identity("SELECT toFloat(like)")
self.validate_identity("SELECT like")
string_types = [ string_types = [
"BLOB", "BLOB",
"LONGBLOB", "LONGBLOB",


@ -7,6 +7,7 @@ class TestDatabricks(Validator):
dialect = "databricks" dialect = "databricks"
def test_databricks(self): def test_databricks(self):
self.validate_identity("ALTER TABLE labels ADD COLUMN label_score FLOAT")
self.validate_identity("DESCRIBE HISTORY a.b") self.validate_identity("DESCRIBE HISTORY a.b")
self.validate_identity("DESCRIBE history.tbl") self.validate_identity("DESCRIBE history.tbl")
self.validate_identity("CREATE TABLE t (a STRUCT<c: MAP<STRING, STRING>>)") self.validate_identity("CREATE TABLE t (a STRUCT<c: MAP<STRING, STRING>>)")


@ -628,7 +628,7 @@ class TestDialect(Validator):
write={ write={
"duckdb": "EPOCH(STRPTIME('2020-01-01', '%Y-%m-%d'))", "duckdb": "EPOCH(STRPTIME('2020-01-01', '%Y-%m-%d'))",
"hive": "UNIX_TIMESTAMP('2020-01-01', 'yyyy-MM-dd')", "hive": "UNIX_TIMESTAMP('2020-01-01', 'yyyy-MM-dd')",
"presto": "TO_UNIXTIME(COALESCE(TRY(DATE_PARSE(CAST('2020-01-01' AS VARCHAR), '%Y-%m-%d')), PARSE_DATETIME(CAST('2020-01-01' AS VARCHAR), 'yyyy-MM-dd')))", "presto": "TO_UNIXTIME(COALESCE(TRY(DATE_PARSE(CAST('2020-01-01' AS VARCHAR), '%Y-%m-%d')), PARSE_DATETIME(DATE_FORMAT(CAST('2020-01-01' AS TIMESTAMP), '%Y-%m-%d'), 'yyyy-MM-dd')))",
"starrocks": "UNIX_TIMESTAMP('2020-01-01', '%Y-%m-%d')", "starrocks": "UNIX_TIMESTAMP('2020-01-01', '%Y-%m-%d')",
"doris": "UNIX_TIMESTAMP('2020-01-01', '%Y-%m-%d')", "doris": "UNIX_TIMESTAMP('2020-01-01', '%Y-%m-%d')",
}, },


@ -308,6 +308,14 @@ class TestDuckDB(Validator):
"SELECT JSON_EXTRACT(c, '$.k1') = 'v1'", "SELECT JSON_EXTRACT(c, '$.k1') = 'v1'",
"SELECT (c -> '$.k1') = 'v1'", "SELECT (c -> '$.k1') = 'v1'",
) )
self.validate_identity(
"SELECT JSON_EXTRACT(c, '$[*].id')[0:2]",
"SELECT (c -> '$[*].id')[0 : 2]",
)
self.validate_identity(
"SELECT JSON_EXTRACT_STRING(c, '$[*].id')[0:2]",
"SELECT (c ->> '$[*].id')[0 : 2]",
)
self.validate_identity( self.validate_identity(
"""SELECT '{"foo": [1, 2, 3]}' -> 'foo' -> 0""", """SELECT '{"foo": [1, 2, 3]}' -> 'foo' -> 0""",
"""SELECT '{"foo": [1, 2, 3]}' -> '$.foo' -> '$[0]'""", """SELECT '{"foo": [1, 2, 3]}' -> '$.foo' -> '$[0]'""",
@ -1048,7 +1056,14 @@ class TestDuckDB(Validator):
"CAST([STRUCT_PACK(a := 1)] AS STRUCT(a BIGINT)[])", "CAST([STRUCT_PACK(a := 1)] AS STRUCT(a BIGINT)[])",
"CAST([ROW(1)] AS STRUCT(a BIGINT)[])", "CAST([ROW(1)] AS STRUCT(a BIGINT)[])",
) )
self.validate_identity(
"STRUCT_PACK(a := 'b')::json",
"CAST({'a': 'b'} AS JSON)",
)
self.validate_identity(
"STRUCT_PACK(a := 'b')::STRUCT(a TEXT)",
"CAST(ROW('b') AS STRUCT(a TEXT))",
)
self.validate_all( self.validate_all(
"CAST(x AS VARCHAR(5))", "CAST(x AS VARCHAR(5))",
write={ write={
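
Likewise for the STRUCT_PACK casts added above; a sketch:

```python
import sqlglot

print(sqlglot.transpile("STRUCT_PACK(a := 'b')::json", read="duckdb", write="duckdb")[0])
# CAST({'a': 'b'} AS JSON)
print(sqlglot.transpile("STRUCT_PACK(a := 'b')::STRUCT(a TEXT)", read="duckdb", write="duckdb")[0])
# CAST(ROW('b') AS STRUCT(a TEXT))
```
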


@ -372,7 +372,7 @@ class TestHive(Validator):
"UNIX_TIMESTAMP(x)", "UNIX_TIMESTAMP(x)",
write={ write={
"duckdb": "EPOCH(STRPTIME(x, '%Y-%m-%d %H:%M:%S'))", "duckdb": "EPOCH(STRPTIME(x, '%Y-%m-%d %H:%M:%S'))",
"presto": "TO_UNIXTIME(COALESCE(TRY(DATE_PARSE(CAST(x AS VARCHAR), '%Y-%m-%d %T')), PARSE_DATETIME(CAST(x AS VARCHAR), 'yyyy-MM-dd HH:mm:ss')))", "presto": "TO_UNIXTIME(COALESCE(TRY(DATE_PARSE(CAST(x AS VARCHAR), '%Y-%m-%d %T')), PARSE_DATETIME(DATE_FORMAT(x, '%Y-%m-%d %T'), 'yyyy-MM-dd HH:mm:ss')))",
"hive": "UNIX_TIMESTAMP(x)", "hive": "UNIX_TIMESTAMP(x)",
"spark": "UNIX_TIMESTAMP(x)", "spark": "UNIX_TIMESTAMP(x)",
"": "STR_TO_UNIX(x, '%Y-%m-%d %H:%M:%S')", "": "STR_TO_UNIX(x, '%Y-%m-%d %H:%M:%S')",


@ -24,6 +24,9 @@ class TestMySQL(Validator):
self.validate_identity("ALTER TABLE t ADD INDEX `i` (`c`)") self.validate_identity("ALTER TABLE t ADD INDEX `i` (`c`)")
self.validate_identity("ALTER TABLE t ADD UNIQUE `i` (`c`)") self.validate_identity("ALTER TABLE t ADD UNIQUE `i` (`c`)")
self.validate_identity("ALTER TABLE test_table MODIFY COLUMN test_column LONGTEXT") self.validate_identity("ALTER TABLE test_table MODIFY COLUMN test_column LONGTEXT")
self.validate_identity(
"INSERT INTO things (a, b) VALUES (1, 2) AS new_data ON DUPLICATE KEY UPDATE id = LAST_INSERT_ID(id), a = new_data.a, b = new_data.b"
)
self.validate_identity( self.validate_identity(
"CREATE TABLE `oauth_consumer` (`key` VARCHAR(32) NOT NULL, UNIQUE `OAUTH_CONSUMER_KEY` (`key`))" "CREATE TABLE `oauth_consumer` (`key` VARCHAR(32) NOT NULL, UNIQUE `OAUTH_CONSUMER_KEY` (`key`))"
) )


@ -17,9 +17,6 @@ class TestPostgres(Validator):
) )
self.validate_identity("SHA384(x)") self.validate_identity("SHA384(x)")
self.validate_identity(
'CREATE TABLE x (a TEXT COLLATE "de_DE")', "CREATE TABLE x (a TEXT COLLATE de_DE)"
)
self.validate_identity("1.x", "1. AS x") self.validate_identity("1.x", "1. AS x")
self.validate_identity("|/ x", "SQRT(x)") self.validate_identity("|/ x", "SQRT(x)")
self.validate_identity("||/ x", "CBRT(x)") self.validate_identity("||/ x", "CBRT(x)")
@ -565,15 +562,10 @@ class TestPostgres(Validator):
"postgres": "GENERATE_SERIES(CAST('2019-01-01' AS TIMESTAMP), CURRENT_TIMESTAMP, INTERVAL '1 DAY')", "postgres": "GENERATE_SERIES(CAST('2019-01-01' AS TIMESTAMP), CURRENT_TIMESTAMP, INTERVAL '1 DAY')",
"presto": "SEQUENCE(CAST('2019-01-01' AS TIMESTAMP), CAST(CURRENT_TIMESTAMP AS TIMESTAMP), INTERVAL '1' DAY)", "presto": "SEQUENCE(CAST('2019-01-01' AS TIMESTAMP), CAST(CURRENT_TIMESTAMP AS TIMESTAMP), INTERVAL '1' DAY)",
"trino": "SEQUENCE(CAST('2019-01-01' AS TIMESTAMP), CAST(CURRENT_TIMESTAMP AS TIMESTAMP), INTERVAL '1' DAY)", "trino": "SEQUENCE(CAST('2019-01-01' AS TIMESTAMP), CAST(CURRENT_TIMESTAMP AS TIMESTAMP), INTERVAL '1' DAY)",
}, "hive": "SEQUENCE(CAST('2019-01-01' AS TIMESTAMP), CAST(CURRENT_TIMESTAMP() AS TIMESTAMP), INTERVAL '1' DAY)",
) "spark2": "SEQUENCE(CAST('2019-01-01' AS TIMESTAMP), CAST(CURRENT_TIMESTAMP() AS TIMESTAMP), INTERVAL '1' DAY)",
self.validate_all( "spark": "SEQUENCE(CAST('2019-01-01' AS TIMESTAMP), CAST(CURRENT_TIMESTAMP() AS TIMESTAMP), INTERVAL '1' DAY)",
"GENERATE_SERIES(a, b)", "databricks": "SEQUENCE(CAST('2019-01-01' AS TIMESTAMP), CAST(CURRENT_TIMESTAMP() AS TIMESTAMP), INTERVAL '1' DAY)",
write={
"postgres": "GENERATE_SERIES(a, b)",
"presto": "SEQUENCE(a, b)",
"trino": "SEQUENCE(a, b)",
"tsql": "GENERATE_SERIES(a, b)",
}, },
) )
self.validate_all( self.validate_all(
@ -583,6 +575,20 @@ class TestPostgres(Validator):
"presto": "SEQUENCE(a, b)", "presto": "SEQUENCE(a, b)",
"trino": "SEQUENCE(a, b)", "trino": "SEQUENCE(a, b)",
"tsql": "GENERATE_SERIES(a, b)", "tsql": "GENERATE_SERIES(a, b)",
"hive": "SEQUENCE(a, b)",
"spark2": "SEQUENCE(a, b)",
"spark": "SEQUENCE(a, b)",
"databricks": "SEQUENCE(a, b)",
},
write={
"postgres": "GENERATE_SERIES(a, b)",
"presto": "SEQUENCE(a, b)",
"trino": "SEQUENCE(a, b)",
"tsql": "GENERATE_SERIES(a, b)",
"hive": "SEQUENCE(a, b)",
"spark2": "SEQUENCE(a, b)",
"spark": "SEQUENCE(a, b)",
"databricks": "SEQUENCE(a, b)",
}, },
) )
self.validate_all( self.validate_all(
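
With Hive, Spark and Databricks added to both the read and write expectations above, GENERATE_SERIES and SEQUENCE now convert in both directions; a sketch:

```python
import sqlglot

print(sqlglot.transpile("GENERATE_SERIES(a, b)", read="postgres", write="databricks")[0])
# SEQUENCE(a, b)
print(sqlglot.transpile("SEQUENCE(a, b)", read="spark", write="postgres")[0])
# GENERATE_SERIES(a, b)
```
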
@ -759,6 +765,14 @@ class TestPostgres(Validator):
}, },
) )
self.validate_all(
"SELECT TO_DATE('01/01/2000', 'MM/DD/YYYY')",
write={
"duckdb": "SELECT CAST(STRPTIME('01/01/2000', '%m/%d/%Y') AS DATE)",
"postgres": "SELECT TO_DATE('01/01/2000', 'MM/DD/YYYY')",
},
)
def test_ddl(self): def test_ddl(self):
# Checks that user-defined types are parsed into DataType instead of Identifier # Checks that user-defined types are parsed into DataType instead of Identifier
self.parse_one("CREATE TABLE t (a udt)").this.expressions[0].args["kind"].assert_is( self.parse_one("CREATE TABLE t (a udt)").this.expressions[0].args["kind"].assert_is(
@ -775,6 +789,8 @@ class TestPostgres(Validator):
cdef.args["kind"].assert_is(exp.DataType) cdef.args["kind"].assert_is(exp.DataType)
self.assertEqual(expr.sql(dialect="postgres"), "CREATE TABLE t (x INTERVAL DAY)") self.assertEqual(expr.sql(dialect="postgres"), "CREATE TABLE t (x INTERVAL DAY)")
self.validate_identity('CREATE TABLE x (a TEXT COLLATE "de_DE")')
self.validate_identity('CREATE TABLE x (a TEXT COLLATE pg_catalog."default")')
self.validate_identity("CREATE TABLE t (col INT[3][5])") self.validate_identity("CREATE TABLE t (col INT[3][5])")
self.validate_identity("CREATE TABLE t (col INT[3])") self.validate_identity("CREATE TABLE t (col INT[3])")
self.validate_identity("CREATE INDEX IF NOT EXISTS ON t(c)") self.validate_identity("CREATE INDEX IF NOT EXISTS ON t(c)")
@ -981,6 +997,34 @@ class TestPostgres(Validator):
self.validate_identity("CREATE TABLE tbl (col UUID UNIQUE DEFAULT GEN_RANDOM_UUID())") self.validate_identity("CREATE TABLE tbl (col UUID UNIQUE DEFAULT GEN_RANDOM_UUID())")
self.validate_identity("CREATE TABLE tbl (col UUID, UNIQUE NULLS NOT DISTINCT (col))") self.validate_identity("CREATE TABLE tbl (col UUID, UNIQUE NULLS NOT DISTINCT (col))")
self.validate_identity("CREATE INDEX CONCURRENTLY ix_table_id ON tbl USING btree(id)")
self.validate_identity(
"CREATE INDEX CONCURRENTLY IF NOT EXISTS ix_table_id ON tbl USING btree(id)"
)
self.validate_identity(
"""
CREATE TABLE IF NOT EXISTS public.rental
(
inventory_id INT NOT NULL,
CONSTRAINT rental_customer_id_fkey FOREIGN KEY (customer_id)
REFERENCES public.customer (customer_id) MATCH FULL
ON UPDATE CASCADE
ON DELETE RESTRICT,
CONSTRAINT rental_inventory_id_fkey FOREIGN KEY (inventory_id)
REFERENCES public.inventory (inventory_id) MATCH PARTIAL
ON UPDATE CASCADE
ON DELETE RESTRICT,
CONSTRAINT rental_staff_id_fkey FOREIGN KEY (staff_id)
REFERENCES public.staff (staff_id) MATCH SIMPLE
ON UPDATE CASCADE
ON DELETE RESTRICT,
INITIALLY IMMEDIATE
)
""",
"CREATE TABLE IF NOT EXISTS public.rental (inventory_id INT NOT NULL, CONSTRAINT rental_customer_id_fkey FOREIGN KEY (customer_id) REFERENCES public.customer (customer_id) MATCH FULL ON UPDATE CASCADE ON DELETE RESTRICT, CONSTRAINT rental_inventory_id_fkey FOREIGN KEY (inventory_id) REFERENCES public.inventory (inventory_id) MATCH PARTIAL ON UPDATE CASCADE ON DELETE RESTRICT, CONSTRAINT rental_staff_id_fkey FOREIGN KEY (staff_id) REFERENCES public.staff (staff_id) MATCH SIMPLE ON UPDATE CASCADE ON DELETE RESTRICT, INITIALLY IMMEDIATE)",
)
with self.assertRaises(ParseError): with self.assertRaises(ParseError):
transpile("CREATE TABLE products (price DECIMAL CHECK price > 0)", read="postgres") transpile("CREATE TABLE products (price DECIMAL CHECK price > 0)", read="postgres")
with self.assertRaises(ParseError): with self.assertRaises(ParseError):


@ -413,6 +413,19 @@ class TestPresto(Validator):
}, },
) )
self.validate_identity("DATE_ADD('DAY', FLOOR(5), y)")
self.validate_identity(
"""SELECT DATE_ADD('DAY', MOD(5, 2.5), y), DATE_ADD('DAY', CEIL(5.5), y)""",
"""SELECT DATE_ADD('DAY', CAST(5 % 2.5 AS BIGINT), y), DATE_ADD('DAY', CAST(CEIL(5.5) AS BIGINT), y)""",
)
self.validate_all(
"DATE_ADD('MINUTE', CAST(FLOOR(CAST(EXTRACT(MINUTE FROM CURRENT_TIMESTAMP) AS DOUBLE) / NULLIF(30, 0)) * 30 AS BIGINT), col)",
read={
"spark": "TIMESTAMPADD(MINUTE, FLOOR(EXTRACT(MINUTE FROM CURRENT_TIMESTAMP)/30)*30, col)",
},
)
def test_ddl(self): def test_ddl(self):
self.validate_all( self.validate_all(
"CREATE TABLE test WITH (FORMAT = 'PARQUET') AS SELECT 1", "CREATE TABLE test WITH (FORMAT = 'PARQUET') AS SELECT 1",
@ -942,8 +955,8 @@ class TestPresto(Validator):
write={ write={
"bigquery": "SELECT * FROM UNNEST(['7', '14'])", "bigquery": "SELECT * FROM UNNEST(['7', '14'])",
"presto": "SELECT * FROM UNNEST(ARRAY['7', '14']) AS x", "presto": "SELECT * FROM UNNEST(ARRAY['7', '14']) AS x",
"hive": "SELECT * FROM UNNEST(ARRAY('7', '14')) AS x", "hive": "SELECT * FROM EXPLODE(ARRAY('7', '14')) AS x",
"spark": "SELECT * FROM UNNEST(ARRAY('7', '14')) AS x", "spark": "SELECT * FROM EXPLODE(ARRAY('7', '14')) AS x",
}, },
) )
self.validate_all( self.validate_all(
@ -951,8 +964,8 @@ class TestPresto(Validator):
write={ write={
"bigquery": "SELECT * FROM UNNEST(['7', '14']) AS y", "bigquery": "SELECT * FROM UNNEST(['7', '14']) AS y",
"presto": "SELECT * FROM UNNEST(ARRAY['7', '14']) AS x(y)", "presto": "SELECT * FROM UNNEST(ARRAY['7', '14']) AS x(y)",
"hive": "SELECT * FROM UNNEST(ARRAY('7', '14')) AS x(y)", "hive": "SELECT * FROM EXPLODE(ARRAY('7', '14')) AS x(y)",
"spark": "SELECT * FROM UNNEST(ARRAY('7', '14')) AS x(y)", "spark": "SELECT * FROM EXPLODE(ARRAY('7', '14')) AS x(y)",
}, },
) )
self.validate_all( self.validate_all(


@ -11,6 +11,10 @@ class TestSnowflake(Validator):
dialect = "snowflake" dialect = "snowflake"
def test_snowflake(self): def test_snowflake(self):
self.validate_identity(
"SELECT * FROM table AT (TIMESTAMP => '2024-07-24') UNPIVOT(a FOR b IN (c)) AS pivot_table"
)
self.assertEqual( self.assertEqual(
# Ensures we don't fail when generating ParseJSON with the `safe` arg set to `True` # Ensures we don't fail when generating ParseJSON with the `safe` arg set to `True`
self.validate_identity("""SELECT TRY_PARSE_JSON('{"x: 1}')""").sql(), self.validate_identity("""SELECT TRY_PARSE_JSON('{"x: 1}')""").sql(),
@ -827,6 +831,22 @@ WHERE
}, },
) )
self.validate_all(
"SELECT OBJECT_INSERT(OBJECT_INSERT(OBJECT_INSERT(OBJECT_CONSTRUCT('key5', 'value5'), 'key1', 5), 'key2', 2.2), 'key3', 'value3')",
write={
"snowflake": "SELECT OBJECT_INSERT(OBJECT_INSERT(OBJECT_INSERT(OBJECT_CONSTRUCT('key5', 'value5'), 'key1', 5), 'key2', 2.2), 'key3', 'value3')",
"duckdb": "SELECT STRUCT_INSERT(STRUCT_INSERT(STRUCT_INSERT({'key5': 'value5'}, key1 := 5), key2 := 2.2), key3 := 'value3')",
},
)
self.validate_all(
"SELECT OBJECT_INSERT(OBJECT_INSERT(OBJECT_INSERT(OBJECT_CONSTRUCT(), 'key1', 5), 'key2', 2.2), 'key3', 'value3')",
write={
"snowflake": "SELECT OBJECT_INSERT(OBJECT_INSERT(OBJECT_INSERT(OBJECT_CONSTRUCT(), 'key1', 5), 'key2', 2.2), 'key3', 'value3')",
"duckdb": "SELECT STRUCT_INSERT(STRUCT_INSERT(STRUCT_PACK(key1 := 5), key2 := 2.2), key3 := 'value3')",
},
)
def test_null_treatment(self): def test_null_treatment(self):
self.validate_all( self.validate_all(
r"SELECT FIRST_VALUE(TABLE1.COLUMN1) OVER (PARTITION BY RANDOM_COLUMN1, RANDOM_COLUMN2 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS MY_ALIAS FROM TABLE1", r"SELECT FIRST_VALUE(TABLE1.COLUMN1) OVER (PARTITION BY RANDOM_COLUMN1, RANDOM_COLUMN2 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS MY_ALIAS FROM TABLE1",
@ -899,6 +919,11 @@ WHERE
"SELECT * FROM @foo/bar (FILE_FORMAT => ds_sandbox.test.my_csv_format, PATTERN => 'test') AS bla", "SELECT * FROM @foo/bar (FILE_FORMAT => ds_sandbox.test.my_csv_format, PATTERN => 'test') AS bla",
) )
self.validate_identity(
"SELECT * FROM @test.public.thing/location/somefile.csv( FILE_FORMAT => 'fmt' )",
"SELECT * FROM @test.public.thing/location/somefile.csv (FILE_FORMAT => 'fmt')",
)
def test_sample(self): def test_sample(self):
self.validate_identity("SELECT * FROM testtable TABLESAMPLE BERNOULLI (20.3)") self.validate_identity("SELECT * FROM testtable TABLESAMPLE BERNOULLI (20.3)")
self.validate_identity("SELECT * FROM testtable TABLESAMPLE SYSTEM (3) SEED (82)") self.validate_identity("SELECT * FROM testtable TABLESAMPLE SYSTEM (3) SEED (82)")


@ -391,6 +391,17 @@ class TestTSQL(Validator):
self.validate_identity("HASHBYTES('MD2', 'x')") self.validate_identity("HASHBYTES('MD2', 'x')")
self.validate_identity("LOG(n, b)") self.validate_identity("LOG(n, b)")
self.validate_all(
"STDEV(x)",
read={
"": "STDDEV(x)",
},
write={
"": "STDDEV(x)",
"tsql": "STDEV(x)",
},
)
def test_option(self): def test_option(self):
possible_options = [ possible_options = [
"HASH GROUP", "HASH GROUP",
@ -888,6 +899,14 @@ class TestTSQL(Validator):
}, },
) )
for colstore in ("NONCLUSTERED COLUMNSTORE", "CLUSTERED COLUMNSTORE"):
self.validate_identity(f"CREATE {colstore} INDEX index_name ON foo.bar")
self.validate_identity(
"CREATE COLUMNSTORE INDEX index_name ON foo.bar",
"CREATE NONCLUSTERED COLUMNSTORE INDEX index_name ON foo.bar",
)
def test_insert_cte(self): def test_insert_cte(self):
self.validate_all( self.validate_all(
"INSERT INTO foo.bar WITH cte AS (SELECT 1 AS one) SELECT * FROM cte", "INSERT INTO foo.bar WITH cte AS (SELECT 1 AS one) SELECT * FROM cte",


@ -204,6 +204,7 @@ USE ROLE x
USE WAREHOUSE x USE WAREHOUSE x
USE DATABASE x USE DATABASE x
USE SCHEMA x.y USE SCHEMA x.y
USE CATALOG abc
NOT 1 NOT 1
NOT NOT 1 NOT NOT 1
SELECT * FROM test SELECT * FROM test
@ -870,3 +871,4 @@ SELECT unnest
SELECT * FROM a STRAIGHT_JOIN b SELECT * FROM a STRAIGHT_JOIN b
SELECT COUNT(DISTINCT "foo bar") FROM (SELECT 1 AS "foo bar") AS t SELECT COUNT(DISTINCT "foo bar") FROM (SELECT 1 AS "foo bar") AS t
SELECT vector SELECT vector
WITH all AS (SELECT 1 AS count) SELECT all.count FROM all
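
Both new fixtures round-trip through the default dialect; a sketch:

```python
import sqlglot

for sql in (
    "USE CATALOG abc",
    "WITH all AS (SELECT 1 AS count) SELECT all.count FROM all",
):
    print(sqlglot.transpile(sql)[0] == sql)  # True
```
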


@ -547,7 +547,8 @@ FROM (
"tb"."b" AS "b", "tb"."b" AS "b",
"tb"."c" AS "c" "tb"."c" AS "c"
FROM "sc"."tb" AS "tb" FROM "sc"."tb" AS "tb"
) AS "_q_0" PIVOT(SUM("_q_0"."c") FOR "_q_0"."b" IN ('x', 'y', 'z')) AS "_q_1"; ) AS "_q_0"
PIVOT(SUM("_q_0"."c") FOR "_q_0"."b" IN ('x', 'y', 'z')) AS "_q_1";
# title: pivoted source with explicit selections where one of them is excluded & selected at the same time # title: pivoted source with explicit selections where one of them is excluded & selected at the same time
# note: we need to respect the exclude when selecting * from pivoted source and not include the computed column twice # note: we need to respect the exclude when selecting * from pivoted source and not include the computed column twice
@ -564,7 +565,8 @@ FROM (
"tb"."b" AS "b", "tb"."b" AS "b",
"tb"."c" AS "c" "tb"."c" AS "c"
FROM "sc"."tb" AS "tb" FROM "sc"."tb" AS "tb"
) AS "_q_0" PIVOT(SUM("_q_0"."c") FOR "_q_0"."b" IN ('x', 'y', 'z')) AS "_q_1"; ) AS "_q_0"
PIVOT(SUM("_q_0"."c") FOR "_q_0"."b" IN ('x', 'y', 'z')) AS "_q_1";
# title: pivoted source with implicit selections # title: pivoted source with implicit selections
# execute: false # execute: false
@ -579,7 +581,8 @@ FROM (
"u"."g" AS "g", "u"."g" AS "g",
"u"."h" AS "h" "u"."h" AS "h"
FROM "u" AS "u" FROM "u" AS "u"
) AS "_q_0" PIVOT(SUM("_q_0"."f") FOR "_q_0"."h" IN ('x', 'y')) AS "_q_1"; ) AS "_q_0"
PIVOT(SUM("_q_0"."f") FOR "_q_0"."h" IN ('x', 'y')) AS "_q_1";
# title: selecting explicit qualified columns from pivoted source with explicit selections # title: selecting explicit qualified columns from pivoted source with explicit selections
# execute: false # execute: false
@ -592,7 +595,8 @@ FROM (
"u"."f" AS "f", "u"."f" AS "f",
"u"."h" AS "h" "u"."h" AS "h"
FROM "u" AS "u" FROM "u" AS "u"
) AS "_q_0" PIVOT(SUM("_q_0"."f") FOR "_q_0"."h" IN ('x', 'y')) AS "piv"; ) AS "_q_0"
PIVOT(SUM("_q_0"."f") FOR "_q_0"."h" IN ('x', 'y')) AS "piv";
# title: selecting explicit unqualified columns from pivoted source with implicit selections # title: selecting explicit unqualified columns from pivoted source with implicit selections
# execute: false # execute: false
@ -600,7 +604,8 @@ SELECT x, y FROM u PIVOT (SUM(f) FOR h IN ('x', 'y'));
SELECT SELECT
"_q_0"."x" AS "x", "_q_0"."x" AS "x",
"_q_0"."y" AS "y" "_q_0"."y" AS "y"
FROM "u" AS "u" PIVOT(SUM("u"."f") FOR "u"."h" IN ('x', 'y')) AS "_q_0"; FROM "u" AS "u"
PIVOT(SUM("u"."f") FOR "u"."h" IN ('x', 'y')) AS "_q_0";
# title: selecting all columns from a pivoted CTE source, using alias for the aggregation and generating bigquery # title: selecting all columns from a pivoted CTE source, using alias for the aggregation and generating bigquery
# execute: false # execute: false
@ -617,7 +622,8 @@ SELECT
`_q_0`.`g` AS `g`, `_q_0`.`g` AS `g`,
`_q_0`.`sum_x` AS `sum_x`, `_q_0`.`sum_x` AS `sum_x`,
`_q_0`.`sum_y` AS `sum_y` `_q_0`.`sum_y` AS `sum_y`
FROM `u_cte` AS `u_cte` PIVOT(SUM(`u_cte`.`f`) AS `sum` FOR `u_cte`.`h` IN ('x', 'y')) AS `_q_0`; FROM `u_cte` AS `u_cte`
PIVOT(SUM(`u_cte`.`f`) AS `sum` FOR `u_cte`.`h` IN ('x', 'y')) AS `_q_0`;
# title: selecting all columns from a pivoted source and generating snowflake # title: selecting all columns from a pivoted source and generating snowflake
# execute: false # execute: false
@ -627,7 +633,8 @@ SELECT
"_q_0"."G" AS "G", "_q_0"."G" AS "G",
"_q_0"."'x'" AS "'x'", "_q_0"."'x'" AS "'x'",
"_q_0"."'y'" AS "'y'" "_q_0"."'y'" AS "'y'"
FROM "U" AS "U" PIVOT(SUM("U"."F") FOR "U"."H" IN ('x', 'y')) AS "_q_0"; FROM "U" AS "U"
PIVOT(SUM("U"."F") FOR "U"."H" IN ('x', 'y')) AS "_q_0";
# title: selecting all columns from a pivoted source and generating spark # title: selecting all columns from a pivoted source and generating spark
# note: spark doesn't allow pivot aliases or qualified columns for the pivot's "field" (`h`) # note: spark doesn't allow pivot aliases or qualified columns for the pivot's "field" (`h`)
@ -641,7 +648,8 @@ SELECT
FROM ( FROM (
SELECT SELECT
* *
FROM `u` AS `u` PIVOT(SUM(`u`.`f`) FOR `h` IN ('x', 'y')) FROM `u` AS `u`
PIVOT(SUM(`u`.`f`) FOR `h` IN ('x', 'y'))
) AS `_q_0`; ) AS `_q_0`;
# title: selecting all columns from a pivoted source, pivot has column aliases # title: selecting all columns from a pivoted source, pivot has column aliases
@ -674,7 +682,8 @@ WITH "SOURCE" AS (
SELECT SELECT
"FINAL"."ID" AS "ID", "FINAL"."ID" AS "ID",
"FINAL"."TIMESTAMP_1" AS "TIMESTAMP_1" "FINAL"."TIMESTAMP_1" AS "TIMESTAMP_1"
FROM "SOURCE" AS "SOURCE" PIVOT(MAX("SOURCE"."VALUE") FOR "SOURCE"."KEY" IN ('a', 'b', 'c')) AS "FINAL"("ID", "TIMESTAMP_1", "TIMESTAMP_2", "COL_1", "COL_2", "COL_3"); FROM "SOURCE" AS "SOURCE"
PIVOT(MAX("SOURCE"."VALUE") FOR "SOURCE"."KEY" IN ('a', 'b', 'c')) AS "FINAL"("ID", "TIMESTAMP_1", "TIMESTAMP_2", "COL_1", "COL_2", "COL_3");
# title: unpivoted table source with a single value column, unpivot columns can't be qualified # title: unpivoted table source with a single value column, unpivot columns can't be qualified
# execute: false # execute: false
@ -685,7 +694,8 @@ SELECT
"_q_0"."DEPT" AS "DEPT", "_q_0"."DEPT" AS "DEPT",
"_q_0"."MONTH" AS "MONTH", "_q_0"."MONTH" AS "MONTH",
"_q_0"."SALES" AS "SALES" "_q_0"."SALES" AS "SALES"
FROM "M_SALES" AS "M_SALES"("EMPID", "DEPT", "JAN", "FEB") UNPIVOT("SALES" FOR "MONTH" IN ("JAN", "FEB")) AS "_q_0" FROM "M_SALES" AS "M_SALES"("EMPID", "DEPT", "JAN", "FEB")
UNPIVOT("SALES" FOR "MONTH" IN ("JAN", "FEB")) AS "_q_0"
ORDER BY ORDER BY
"_q_0"."EMPID"; "_q_0"."EMPID";
@ -704,7 +714,8 @@ FROM (
"m_sales"."jan" AS "jan", "m_sales"."jan" AS "jan",
"m_sales"."feb" AS "feb" "m_sales"."feb" AS "feb"
FROM "m_sales" AS "m_sales" FROM "m_sales" AS "m_sales"
) AS "m_sales" UNPIVOT("sales" FOR "month" IN ("m_sales"."jan", "m_sales"."feb")) AS "unpiv"("a", "b", "c", "d"); ) AS "m_sales"
UNPIVOT("sales" FOR "month" IN ("m_sales"."jan", "m_sales"."feb")) AS "unpiv"("a", "b", "c", "d");
# title: unpivoted derived table source with a single value column # title: unpivoted derived table source with a single value column
# execute: false # execute: false
@ -722,20 +733,22 @@ FROM (
"M_SALES"."JAN" AS "JAN", "M_SALES"."JAN" AS "JAN",
"M_SALES"."FEB" AS "FEB" "M_SALES"."FEB" AS "FEB"
FROM "M_SALES" AS "M_SALES" FROM "M_SALES" AS "M_SALES"
) AS "M_SALES" UNPIVOT("SALES" FOR "MONTH" IN ("JAN", "FEB")) AS "_q_0" ) AS "M_SALES"
UNPIVOT("SALES" FOR "MONTH" IN ("JAN", "FEB")) AS "_q_0"
ORDER BY ORDER BY
"_q_0"."EMPID"; "_q_0"."EMPID";
# title: unpivoted table source with a single value column, unpivot columns can be qualified # title: unpivoted table source with a single value column, unpivot columns can be qualified
# execute: false # execute: false
# dialect: bigquery # dialect: bigquery
# note: the named columns aren't supported by BQ but we add them here to avoid defining a schema # note: the named columns aren't supported by BQ but we add them here to avoid defining a schema
SELECT * FROM produce AS produce(product, q1, q2, q3, q4) UNPIVOT(sales FOR quarter IN (q1, q2, q3, q4)); SELECT * FROM produce AS produce(product, q1, q2, q3, q4) UNPIVOT(sales FOR quarter IN (q1, q2, q3, q4));
SELECT SELECT
`_q_0`.`product` AS `product`, `_q_0`.`product` AS `product`,
`_q_0`.`quarter` AS `quarter`, `_q_0`.`quarter` AS `quarter`,
`_q_0`.`sales` AS `sales` `_q_0`.`sales` AS `sales`
FROM `produce` AS `produce` UNPIVOT(`sales` FOR `quarter` IN (`produce`.`q1`, `produce`.`q2`, `produce`.`q3`, `produce`.`q4`)) AS `_q_0`; FROM `produce` AS `produce`
UNPIVOT(`sales` FOR `quarter` IN (`produce`.`q1`, `produce`.`q2`, `produce`.`q3`, `produce`.`q4`)) AS `_q_0`;
# title: unpivoted table source with multiple value columns # title: unpivoted table source with multiple value columns
# execute: false # execute: false
@ -746,7 +759,8 @@ SELECT
`_q_0`.`semesters` AS `semesters`, `_q_0`.`semesters` AS `semesters`,
`_q_0`.`first_half_sales` AS `first_half_sales`, `_q_0`.`first_half_sales` AS `first_half_sales`,
`_q_0`.`second_half_sales` AS `second_half_sales` `_q_0`.`second_half_sales` AS `second_half_sales`
FROM `produce` AS `produce` UNPIVOT((`first_half_sales`, `second_half_sales`) FOR `semesters` IN ((`produce`.`q1`, `produce`.`q2`) AS 'semester_1', (`produce`.`q3`, `produce`.`q4`) AS 'semester_2')) AS `_q_0`; FROM `produce` AS `produce`
UNPIVOT((`first_half_sales`, `second_half_sales`) FOR `semesters` IN ((`produce`.`q1`, `produce`.`q2`) AS 'semester_1', (`produce`.`q3`, `produce`.`q4`) AS 'semester_2')) AS `_q_0`;
# title: quoting is preserved # title: quoting is preserved
# dialect: snowflake # dialect: snowflake

Some files were not shown because too many files have changed in this diff.