Adding upstream version 26.16.2.
Signed-off-by: Daniel Baumann <daniel@debian.org>
parent 4bfa0e7e53
commit 6e767a6f98
110 changed files with 62370 additions and 61414 deletions
CHANGELOG.md | 98

@@ -1,6 +1,101 @@
Changelog
=========

## [v26.16.1] - 2025-04-24

### :sparkles: New Features
- [`27a9fb2`](https://github.com/tobymao/sqlglot/commit/27a9fb26a1936512a09b8b09ed2656e22918f2c6) - **clickhouse**: Support parsing CTAS with alias *(PR [#5003](https://github.com/tobymao/sqlglot/pull/5003) by [@dorranh](https://github.com/dorranh))*
- [`45cd165`](https://github.com/tobymao/sqlglot/commit/45cd165eaca96b33f1de753a147bdc352b9d56d0) - **clickhouse**: Support ClickHouse Nothing type *(PR [#5004](https://github.com/tobymao/sqlglot/pull/5004) by [@dorranh](https://github.com/dorranh))*
- [`ca61a61`](https://github.com/tobymao/sqlglot/commit/ca61a617fa67082bc0fc94853dee4d70b8ca5c59) - Support exp.PartitionByProperty for parse_into() *(PR [#5006](https://github.com/tobymao/sqlglot/pull/5006) by [@erindru](https://github.com/erindru))*
- [`a6d4c3c`](https://github.com/tobymao/sqlglot/commit/a6d4c3c901f828cdd96a16a0e55eac1b244f63be) - **snowflake**: Add numeric parameter support *(PR [#5008](https://github.com/tobymao/sqlglot/pull/5008) by [@hovaesco](https://github.com/hovaesco))*

### :bug: Bug Fixes
- [`8e9dbd4`](https://github.com/tobymao/sqlglot/commit/8e9dbd491b9516c614554e05f05cc1cb976838e3) - **duckdb**: warn on unsupported IGNORE/RESPECT NULLS *(PR [#5002](https://github.com/tobymao/sqlglot/pull/5002) by [@georgesittas](https://github.com/georgesittas))*
  - :arrow_lower_right: *fixes issue [#5001](https://github.com/tobymao/sqlglot/issues/5001) opened by [@MarcoGorelli](https://github.com/MarcoGorelli)*
- [`10b02bc`](https://github.com/tobymao/sqlglot/commit/10b02bce304042fea09e9cb2369db3c873452245) - **clickhouse**: Support optional timezone argument in date_diff() *(PR [#5005](https://github.com/tobymao/sqlglot/pull/5005) by [@dorranh](https://github.com/dorranh))*

### :wrench: Chores
- [`1d4d906`](https://github.com/tobymao/sqlglot/commit/1d4d906abc60d29b6606bc8eee50c92cef21d3fd) - use _try_parse for parsing ClickHouse's CREATE TABLE .. AS <table> *(commit by [@georgesittas](https://github.com/georgesittas))*
- [`fc58c27`](https://github.com/tobymao/sqlglot/commit/fc58c273690734263b971b138ec8f0186f524672) - Refactor placeholder parsing for TokenType.COLON *(PR [#5009](https://github.com/tobymao/sqlglot/pull/5009) by [@VaggelisD](https://github.com/VaggelisD))*

## [v26.16.0] - 2025-04-22

### :boom: BREAKING CHANGES
- due to [`510984f`](https://github.com/tobymao/sqlglot/commit/510984f2ddc6ff13b8a8030f698aed9ad0e6f46b) - stop generating redundant TO_DATE calls *(PR [#4990](https://github.com/tobymao/sqlglot/pull/4990) by [@georgesittas](https://github.com/georgesittas))*:

  stop generating redundant TO_DATE calls (#4990)

- due to [`da9ec61`](https://github.com/tobymao/sqlglot/commit/da9ec61e8edd5049e246390e1b638cf14d50fa2d) - Fix pretty generation of exp.Window *(PR [#4994](https://github.com/tobymao/sqlglot/pull/4994) by [@VaggelisD](https://github.com/VaggelisD))*:

  Fix pretty generation of exp.Window (#4994)

- due to [`fb83fac`](https://github.com/tobymao/sqlglot/commit/fb83fac2d097d8d3e8e2556c072792857609bd94) - remove recursion from `simplify` *(PR [#4988](https://github.com/tobymao/sqlglot/pull/4988) by [@georgesittas](https://github.com/georgesittas))*:

  remove recursion from `simplify` (#4988)

- due to [`890b24a`](https://github.com/tobymao/sqlglot/commit/890b24a5cec269f5595743d0a86024a23217a3f1) - remove `connector_depth` as it is now dead code *(commit by [@georgesittas](https://github.com/georgesittas))*:

  remove `connector_depth` as it is now dead code

- due to [`1dc501b`](https://github.com/tobymao/sqlglot/commit/1dc501b8ed68638375d869e11f3bf188948a4990) - remove `max_depth` argument in simplify as it is now dead code *(commit by [@georgesittas](https://github.com/georgesittas))*:

  remove `max_depth` argument in simplify as it is now dead code

### :sparkles: New Features
- [`76535ce`](https://github.com/tobymao/sqlglot/commit/76535ce9487186d2eb7071fac2f224238de7a9ba) - **optimizer**: add support for Spark's TRANSFORM clause *(PR [#4993](https://github.com/tobymao/sqlglot/pull/4993) by [@georgesittas](https://github.com/georgesittas))*
  - :arrow_lower_right: *addresses issue [#4991](https://github.com/tobymao/sqlglot/issues/4991) opened by [@karta0807913](https://github.com/karta0807913)*

### :bug: Bug Fixes
- [`510984f`](https://github.com/tobymao/sqlglot/commit/510984f2ddc6ff13b8a8030f698aed9ad0e6f46b) - **hive**: stop generating redundant TO_DATE calls *(PR [#4990](https://github.com/tobymao/sqlglot/pull/4990) by [@georgesittas](https://github.com/georgesittas))*
- [`da9ec61`](https://github.com/tobymao/sqlglot/commit/da9ec61e8edd5049e246390e1b638cf14d50fa2d) - **generator**: Fix pretty generation of exp.Window *(PR [#4994](https://github.com/tobymao/sqlglot/pull/4994) by [@VaggelisD](https://github.com/VaggelisD))*
  - :arrow_lower_right: *fixes issue [#4098](https://github.com/TobikoData/sqlmesh/issues/4098) opened by [@tanghyd](https://github.com/tanghyd)*
- [`aae9aa8`](https://github.com/tobymao/sqlglot/commit/aae9aa8f96ccaa7686cda3cdabec208ae4c3d60a) - **optimizer**: ensure there are no shared refs after qualify_tables *(PR [#4995](https://github.com/tobymao/sqlglot/pull/4995) by [@georgesittas](https://github.com/georgesittas))*
- [`adaef42`](https://github.com/tobymao/sqlglot/commit/adaef42234d8f1c9c331f53bee2c42686f29bdec) - **trino**: Dont quote identifiers in string literals for the partitioned_by property *(PR [#4998](https://github.com/tobymao/sqlglot/pull/4998) by [@erindru](https://github.com/erindru))*
- [`a547f8d`](https://github.com/tobymao/sqlglot/commit/a547f8d4292f3b3a4c85f9d6466ead2ad976dfd2) - **postgres**: Capture optional minus sign in interval regex *(PR [#5000](https://github.com/tobymao/sqlglot/pull/5000) by [@VaggelisD](https://github.com/VaggelisD))*
  - :arrow_lower_right: *fixes issue [#4999](https://github.com/tobymao/sqlglot/issues/4999) opened by [@cpimhoff](https://github.com/cpimhoff)*

### :recycle: Refactors
- [`fb83fac`](https://github.com/tobymao/sqlglot/commit/fb83fac2d097d8d3e8e2556c072792857609bd94) - **optimizer**: remove recursion from `simplify` *(PR [#4988](https://github.com/tobymao/sqlglot/pull/4988) by [@georgesittas](https://github.com/georgesittas))*

### :wrench: Chores
- [`890b24a`](https://github.com/tobymao/sqlglot/commit/890b24a5cec269f5595743d0a86024a23217a3f1) - remove `connector_depth` as it is now dead code *(commit by [@georgesittas](https://github.com/georgesittas))*
- [`1dc501b`](https://github.com/tobymao/sqlglot/commit/1dc501b8ed68638375d869e11f3bf188948a4990) - remove `max_depth` argument in simplify as it is now dead code *(commit by [@georgesittas](https://github.com/georgesittas))*
- [`6572517`](https://github.com/tobymao/sqlglot/commit/6572517c1ec76f14cbd661aacc15c84bef065284) - improve tooling around benchmarks *(commit by [@georgesittas](https://github.com/georgesittas))*

## [v26.15.0] - 2025-04-17

### :boom: BREAKING CHANGES
- due to [`2b7845a`](https://github.com/tobymao/sqlglot/commit/2b7845a3a821d366ae90ba9ef5e7d61194a34874) - Add support for Athena's Iceberg partitioning transforms *(PR [#4976](https://github.com/tobymao/sqlglot/pull/4976) by [@VaggelisD](https://github.com/VaggelisD))*:

  Add support for Athena's Iceberg partitioning transforms (#4976)

- due to [`ee794e9`](https://github.com/tobymao/sqlglot/commit/ee794e9c6a3b2fdb142114327d904b6c94a16cd0) - use the standard POWER function instead of ^ fixes [#4982](https://github.com/tobymao/sqlglot/pull/4982) *(commit by [@georgesittas](https://github.com/georgesittas))*:

  use the standard POWER function instead of ^ fixes #4982

- due to [`2369195`](https://github.com/tobymao/sqlglot/commit/2369195635e25dabd5ce26c13e402076508bba04) - consistently parse INTERVAL value as a string *(PR [#4986](https://github.com/tobymao/sqlglot/pull/4986) by [@georgesittas](https://github.com/georgesittas))*:

  consistently parse INTERVAL value as a string (#4986)

- due to [`e866cff`](https://github.com/tobymao/sqlglot/commit/e866cffbaac3b62255d0d5c8be043ab2394af619) - support RELY option for PRIMARY KEY, FOREIGN KEY, and UNIQUE constraints *(PR [#4987](https://github.com/tobymao/sqlglot/pull/4987) by [@geooo109](https://github.com/geooo109))*:

  support RELY option for PRIMARY KEY, FOREIGN KEY, and UNIQUE constraints (#4987)

### :sparkles: New Features
- [`e866cff`](https://github.com/tobymao/sqlglot/commit/e866cffbaac3b62255d0d5c8be043ab2394af619) - **parser**: support RELY option for PRIMARY KEY, FOREIGN KEY, and UNIQUE constraints *(PR [#4987](https://github.com/tobymao/sqlglot/pull/4987) by [@geooo109](https://github.com/geooo109))*
  - :arrow_lower_right: *addresses issue [#4983](https://github.com/tobymao/sqlglot/issues/4983) opened by [@ggadon](https://github.com/ggadon)*

### :bug: Bug Fixes
- [`2b7845a`](https://github.com/tobymao/sqlglot/commit/2b7845a3a821d366ae90ba9ef5e7d61194a34874) - Add support for Athena's Iceberg partitioning transforms *(PR [#4976](https://github.com/tobymao/sqlglot/pull/4976) by [@VaggelisD](https://github.com/VaggelisD))*
- [`fa6af23`](https://github.com/tobymao/sqlglot/commit/fa6af2302f8482c5d89ead481afe4195aaa41a9c) - **optimizer**: compare the whole type to determine if a cast can be removed *(PR [#4981](https://github.com/tobymao/sqlglot/pull/4981) by [@georgesittas](https://github.com/georgesittas))*
  - :arrow_lower_right: *fixes issue [#4977](https://github.com/tobymao/sqlglot/issues/4977) opened by [@MeinAccount](https://github.com/MeinAccount)*
- [`830c9b8`](https://github.com/tobymao/sqlglot/commit/830c9b8bbf906cf5d4fa8028b67dadda73fc58a9) - **unnest_subqueries**: avoid adding GROUP BY on aggregate projections in lateral subqueries *(PR [#4970](https://github.com/tobymao/sqlglot/pull/4970) by [@skadel](https://github.com/skadel))*
- [`ee794e9`](https://github.com/tobymao/sqlglot/commit/ee794e9c6a3b2fdb142114327d904b6c94a16cd0) - **postgres**: use the standard POWER function instead of ^ fixes [#4982](https://github.com/tobymao/sqlglot/pull/4982) *(commit by [@georgesittas](https://github.com/georgesittas))*
- [`85e62b8`](https://github.com/tobymao/sqlglot/commit/85e62b88df2822797f527dce4eaa230c778cbe9e) - **bigquery**: Do not consume JOIN keywords after WITH OFFSET *(PR [#4984](https://github.com/tobymao/sqlglot/pull/4984) by [@VaggelisD](https://github.com/VaggelisD))*
- [`2369195`](https://github.com/tobymao/sqlglot/commit/2369195635e25dabd5ce26c13e402076508bba04) - consistently parse INTERVAL value as a string *(PR [#4986](https://github.com/tobymao/sqlglot/pull/4986) by [@georgesittas](https://github.com/georgesittas))*

## [v26.14.0] - 2025-04-15

### :boom: BREAKING CHANGES
- due to [`cb20038`](https://github.com/tobymao/sqlglot/commit/cb2003875fc6e149bd4a631e99c312a04435a46b) - treat GO as command *(PR [#4978](https://github.com/tobymao/sqlglot/pull/4978) by [@georgesittas](https://github.com/georgesittas))*:

@@ -6358,3 +6453,6 @@ Changelog
[v26.13.1]: https://github.com/tobymao/sqlglot/compare/v26.13.0...v26.13.1
[v26.13.2]: https://github.com/tobymao/sqlglot/compare/v26.13.1...v26.13.2
[v26.14.0]: https://github.com/tobymao/sqlglot/compare/v26.13.2...v26.14.0
[v26.15.0]: https://github.com/tobymao/sqlglot/compare/v26.14.0...v26.15.0
[v26.16.0]: https://github.com/tobymao/sqlglot/compare/v26.15.0...v26.16.0
[v26.16.1]: https://github.com/tobymao/sqlglot/compare/v26.16.0...v26.16.1

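One of the headline entries above, exercised through the public API. A sketch based on the ClickHouse test added later in this diff, not part of the commit itself:

```python
import sqlglot
from sqlglot import exp

# PR #5003: a ClickHouse CTAS whose body is just a table reference now parses
# into an exp.Create whose expression is a Table rather than a SELECT.
create = sqlglot.parse_one(
    "CREATE TABLE my_db.my_table AS another_db.another_table", read="clickhouse"
)
assert isinstance(create, exp.Create)
assert isinstance(create.expression, exp.Table)
```
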
Makefile | 5

@@ -4,7 +4,10 @@ install:
 	pip install -e .
 
 bench: install-dev-rs-release
-	python benchmarks/bench.py
+	python -m benchmarks.bench
+
+bench-optimize: install-dev-rs-release
+	python -m benchmarks.optimize
 
 install-dev-rs-release:
 	cd sqlglotrs/ && python -m maturin develop -r

@@ -533,6 +533,10 @@ make check # Full test suite & linter checks
 | long | 0.00889 (1.0) | 0.00572 (0.643) | 0.36982 (41.56) | 0.00614 (0.690) | 0.02530 (2.844) | 0.02931 (3.294) | 0.00059 (0.066) |
 | crazy | 0.02918 (1.0) | 0.01991 (0.682) | 1.88695 (64.66) | 0.02003 (0.686) | 7.46894 (255.9) | 0.64994 (22.27) | 0.00327 (0.112) |
 
+```
+make bench # Run parsing benchmark
+make bench-optimize # Run optimization benchmark
+```
 
 ## Optional Dependencies

@@ -1,6 +1,6 @@
 import collections.abc
 
-from helpers import ascii_table
+from benchmarks.helpers import ascii_table
 
 # moz_sql_parser 3.10 compatibility
 collections.Iterable = collections.abc.Iterable

@@ -1,12 +1,12 @@
+import sys
 import typing as t
 from argparse import ArgumentParser
 
-from helpers import ascii_table
+from benchmarks.helpers import ascii_table
 from sqlglot.optimizer import optimize
 from sqlglot import parse_one
 from tests.helpers import load_sql_fixture_pairs, TPCH_SCHEMA, TPCDS_SCHEMA
 from timeit import Timer
-import sys
 
 # Deeply nested conditions currently require a lot of recursion
 sys.setrecursionlimit(10000)

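A rough sketch (not part of the commit) of why both benchmark scripts switch to package-qualified imports, consistent with the Makefile change above:

```python
# Under `python -m benchmarks.bench`, the repo root is on sys.path and
# `benchmarks` is imported as a package, so sibling modules need the qualified
# name `benchmarks.helpers`. The old bare `from helpers import ...` only
# resolved when benchmarks/ itself was sys.path[0], i.e. under the previous
# `python benchmarks/bench.py` invocation.
import sys

print(sys.path[0])  # the entry Python prepends; it differs between the two invocations
```
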
File diff suppressed because one or more lines are too long
docs/sqlglot/parser.html | 30655
File diff suppressed because one or more lines are too long

@@ -62,6 +62,7 @@ dialect implementations in order to understand how their various components can
 """
 
 import importlib
+import threading
 
 DIALECTS = [
     "Athena",
@@ -104,11 +105,14 @@ MODULE_BY_ATTRIBUTE = {
 
 __all__ = list(MODULE_BY_ATTRIBUTE)
 
+_import_lock = threading.Lock()
+
 
 def __getattr__(name):
     module_name = MODULE_BY_ATTRIBUTE.get(name)
     if module_name:
-        module = importlib.import_module(f"sqlglot.dialects.{module_name}")
+        with _import_lock:
+            module = importlib.import_module(f"sqlglot.dialects.{module_name}")
         return getattr(module, name)
 
     raise AttributeError(f"module {__name__} has no attribute {name}")

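The hunk above serializes sqlglot's lazy dialect loading. For context, a minimal self-contained sketch of the same pattern (module-level `__getattr__` per PEP 562, guarded by a lock); the names here are illustrative placeholders, not sqlglot's:

```python
import importlib
import threading

# Hypothetical mapping from public attribute to submodule, mirroring the shape
# of sqlglot's MODULE_BY_ATTRIBUTE; "mypkg" is a placeholder package name.
MODULE_BY_ATTRIBUTE = {"MyDialect": "my_dialect"}

_import_lock = threading.Lock()

def __getattr__(name):
    module_name = MODULE_BY_ATTRIBUTE.get(name)
    if module_name:
        # Serialize the first import so concurrent attribute lookups don't race.
        with _import_lock:
            module = importlib.import_module(f"mypkg.dialects.{module_name}")
        return getattr(module, name)
    raise AttributeError(f"module {__name__} has no attribute {name}")
```
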
@@ -103,6 +103,7 @@ def _datetime_delta_sql(name: str) -> t.Callable[[Generator, DATETIME_DELTA]
         unit_to_var(expression),
         expression.expression,
         expression.this,
+        expression.args.get("zone"),
     )
 
     return _delta_sql
@@ -260,6 +261,7 @@ class ClickHouse(Dialect):
             "LOWCARDINALITY": TokenType.LOWCARDINALITY,
             "MAP": TokenType.MAP,
             "NESTED": TokenType.NESTED,
+            "NOTHING": TokenType.NOTHING,
             "SAMPLE": TokenType.TABLE_SAMPLE,
             "TUPLE": TokenType.STRUCT,
             "UINT16": TokenType.USMALLINT,
@@ -301,8 +303,8 @@ class ClickHouse(Dialect):
             "COUNTIF": _build_count_if,
             "DATE_ADD": build_date_delta(exp.DateAdd, default_unit=None),
             "DATEADD": build_date_delta(exp.DateAdd, default_unit=None),
-            "DATE_DIFF": build_date_delta(exp.DateDiff, default_unit=None),
-            "DATEDIFF": build_date_delta(exp.DateDiff, default_unit=None),
+            "DATE_DIFF": build_date_delta(exp.DateDiff, default_unit=None, supports_timezone=True),
+            "DATEDIFF": build_date_delta(exp.DateDiff, default_unit=None, supports_timezone=True),
             "DATE_FORMAT": _build_date_format,
             "DATE_SUB": build_date_delta(exp.DateSub, default_unit=None),
             "DATESUB": build_date_delta(exp.DateSub, default_unit=None),
@@ -1018,6 +1020,7 @@ class ClickHouse(Dialect):
             exp.DataType.Type.LOWCARDINALITY: "LowCardinality",
             exp.DataType.Type.MAP: "Map",
             exp.DataType.Type.NESTED: "Nested",
+            exp.DataType.Type.NOTHING: "Nothing",
             exp.DataType.Type.SMALLINT: "Int16",
             exp.DataType.Type.STRUCT: "Tuple",
             exp.DataType.Type.TINYINT: "Int8",

@@ -27,6 +27,12 @@ class Databricks(Spark):
     class JSONPathTokenizer(jsonpath.JSONPathTokenizer):
         IDENTIFIERS = ["`", '"']
 
+    class Tokenizer(Spark.Tokenizer):
+        KEYWORDS = {
+            **Spark.Tokenizer.KEYWORDS,
+            "VOID": TokenType.VOID,
+        }
+
     class Parser(Spark.Parser):
         LOG_DEFAULTS_TO_LN = True
         STRICT_CAST = True
@@ -83,6 +89,11 @@ class Databricks(Spark):
 
         TRANSFORMS.pop(exp.TryCast)
 
+        TYPE_MAPPING = {
+            **Spark.Generator.TYPE_MAPPING,
+            exp.DataType.Type.NULL: "VOID",
+        }
+
         def columndef_sql(self, expression: exp.ColumnDef, sep: str = " ") -> str:
             constraint = expression.find(exp.GeneratedAsIdentityColumnConstraint)
             kind = expression.kind

@@ -1238,15 +1238,20 @@ def build_date_delta(
     exp_class: t.Type[E],
     unit_mapping: t.Optional[t.Dict[str, str]] = None,
     default_unit: t.Optional[str] = "DAY",
+    supports_timezone: bool = False,
 ) -> t.Callable[[t.List], E]:
     def _builder(args: t.List) -> E:
-        unit_based = len(args) == 3
+        unit_based = len(args) >= 3
+        has_timezone = len(args) == 4
         this = args[2] if unit_based else seq_get(args, 0)
         unit = None
         if unit_based or default_unit:
            unit = args[0] if unit_based else exp.Literal.string(default_unit)
            unit = exp.var(unit_mapping.get(unit.name.lower(), unit.name)) if unit_mapping else unit
-        return exp_class(this=this, expression=seq_get(args, 1), unit=unit)
+        expression = exp_class(this=this, expression=seq_get(args, 1), unit=unit)
+        if supports_timezone and has_timezone:
+            expression.set("zone", args[-1])
+        return expression
 
     return _builder

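What the new `supports_timezone` flag enables, mirroring the ClickHouse tests later in this diff: a fourth argument is captured as the expression's `zone` arg rather than rejected. A sketch:

```python
import sqlglot
from sqlglot import exp

ast = sqlglot.parse_one("SELECT DATE_DIFF(SECOND, 1, bar, 'UTC')", read="clickhouse")
date_diff = ast.find(exp.DateDiff)
assert date_diff is not None
assert date_diff.args.get("zone") is not None  # holds the 'UTC' literal
```
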
@@ -47,14 +47,6 @@ DATETIME_DELTA = t.Union[
     exp.DateAdd, exp.TimeAdd, exp.DatetimeAdd, exp.TsOrDsAdd, exp.DateSub, exp.DatetimeSub
 ]
 
-WINDOW_FUNCS_WITH_IGNORE_NULLS = (
-    exp.FirstValue,
-    exp.LastValue,
-    exp.Lag,
-    exp.Lead,
-    exp.NthValue,
-)
-
 
 def _date_delta_sql(self: DuckDB.Generator, expression: DATETIME_DELTA) -> str:
     this = expression.this
@@ -879,6 +871,14 @@ class DuckDB(Dialect):
         PROPERTIES_LOCATION[exp.TemporaryProperty] = exp.Properties.Location.POST_CREATE
         PROPERTIES_LOCATION[exp.ReturnsProperty] = exp.Properties.Location.POST_ALIAS
 
+        IGNORE_RESPECT_NULLS_WINDOW_FUNCTIONS = (
+            exp.FirstValue,
+            exp.Lag,
+            exp.LastValue,
+            exp.Lead,
+            exp.NthValue,
+        )
+
         def show_sql(self, expression: exp.Show) -> str:
             return f"SHOW {expression.name}"
 
@@ -1098,11 +1098,21 @@ class DuckDB(Dialect):
             return super().unnest_sql(expression)
 
         def ignorenulls_sql(self, expression: exp.IgnoreNulls) -> str:
-            if isinstance(expression.this, WINDOW_FUNCS_WITH_IGNORE_NULLS):
+            if isinstance(expression.this, self.IGNORE_RESPECT_NULLS_WINDOW_FUNCTIONS):
+                # DuckDB should render IGNORE NULLS only for the general-purpose
+                # window functions that accept it e.g. FIRST_VALUE(... IGNORE NULLS) OVER (...)
                 return super().ignorenulls_sql(expression)
 
             self.unsupported("IGNORE NULLS is not supported for non-window functions.")
             return self.sql(expression, "this")
 
+        def respectnulls_sql(self, expression: exp.RespectNulls) -> str:
+            if isinstance(expression.this, self.IGNORE_RESPECT_NULLS_WINDOW_FUNCTIONS):
+                # DuckDB should render RESPECT NULLS only for the general-purpose
+                # window functions that accept it e.g. FIRST_VALUE(... RESPECT NULLS) OVER (...)
+                return super().respectnulls_sql(expression)
+
+            self.unsupported("RESPECT NULLS is not supported for non-window functions.")
+            return self.sql(expression, "this")
+
         def arraytostring_sql(self, expression: exp.ArrayToString) -> str:

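The effect of these overrides, mirrored by the BigQuery test changes later in this diff: for functions outside the window-function tuple, DuckDB output now drops the clause and logs an "unsupported" warning instead of emitting invalid SQL. A sketch:

```python
import sqlglot

out = sqlglot.transpile("SELECT SUM(x RESPECT NULLS) AS x", read="bigquery", write="duckdb")[0]
assert out == "SELECT SUM(x) AS x"  # clause dropped, with an unsupported warning logged
```
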
@@ -62,6 +62,13 @@ TIME_DIFF_FACTOR = {
 
 DIFF_MONTH_SWITCH = ("YEAR", "QUARTER", "MONTH")
 
+TS_OR_DS_EXPRESSIONS = (
+    exp.DateDiff,
+    exp.Day,
+    exp.Month,
+    exp.Year,
+)
+
 
 def _add_date_sql(self: Hive.Generator, expression: DATE_ADD_OR_SUB) -> str:
     if isinstance(expression, exp.TsOrDsAdd) and not expression.unit:
@@ -167,7 +174,7 @@ def _to_date_sql(self: Hive.Generator, expression: exp.TsOrDsToDate) -> str:
     if time_format and time_format not in (Hive.TIME_FORMAT, Hive.DATE_FORMAT):
         return self.func("TO_DATE", expression.this, time_format)
 
-    if isinstance(expression.this, exp.TsOrDsToDate):
+    if isinstance(expression.parent, TS_OR_DS_EXPRESSIONS):
         return self.sql(expression, "this")
 
     return self.func("TO_DATE", expression.this)

@@ -57,11 +57,13 @@ def _no_sort_array(self: Presto.Generator, expression: exp.SortArray) -> str:
 
 def _schema_sql(self: Presto.Generator, expression: exp.Schema) -> str:
     if isinstance(expression.parent, exp.PartitionedByProperty):
+        # Any columns in the ARRAY[] string literals should not be quoted
+        expression.transform(lambda n: n.name if isinstance(n, exp.Identifier) else n, copy=False)
+
         partition_exprs = [
             self.sql(c) if isinstance(c, (exp.Func, exp.Property)) else self.sql(c, "this")
             for c in expression.expressions
         ]
 
         return self.sql(exp.Array(expressions=[exp.Literal.string(c) for c in partition_exprs]))
 
     if expression.parent:

@@ -401,6 +401,8 @@ class Snowflake(Dialect):
         TABLE_ALIAS_TOKENS = parser.Parser.TABLE_ALIAS_TOKENS | {TokenType.WINDOW}
         TABLE_ALIAS_TOKENS.discard(TokenType.MATCH_CONDITION)
 
+        COLON_PLACEHOLDER_TOKENS = ID_VAR_TOKENS | {TokenType.NUMBER}
+
         FUNCTIONS = {
             **parser.Parser.FUNCTIONS,
             "APPROX_PERCENTILE": exp.ApproxQuantile.from_arg_list,

@@ -9,7 +9,7 @@ from sqlglot.errors import ExecuteError
 from sqlglot.executor.context import Context
 from sqlglot.executor.env import ENV
 from sqlglot.executor.table import RowReader, Table
-from sqlglot.helper import csv_reader, ensure_list, subclasses
+from sqlglot.helper import csv_reader, subclasses
 
 
 class PythonExecutor:
@@ -370,8 +370,8 @@ def _rename(self, e):
             return self.func(e.key, *values)
 
         if isinstance(e, exp.Func) and e.is_var_len_args:
-            *head, tail = values
-            return self.func(e.key, *head, *ensure_list(tail))
+            args = itertools.chain.from_iterable(x if isinstance(x, list) else [x] for x in values)
+            return self.func(e.key, *args)
 
         return self.func(e.key, *values)
     except Exception as ex:

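A standalone sketch of the flattening idiom the executor adopts above for variable-length function arguments: list-valued arguments are spliced into the call, scalars pass through unchanged.

```python
import itertools

values = [1, [2, 3], 4]
args = itertools.chain.from_iterable(x if isinstance(x, list) else [x] for x in values)
assert list(args) == [1, 2, 3, 4]
```
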
@@ -4533,6 +4533,7 @@ class DataType(Expression):
         NAME = auto()
         NCHAR = auto()
         NESTED = auto()
+        NOTHING = auto()
         NULL = auto()
         NUMMULTIRANGE = auto()
         NUMRANGE = auto()
@@ -5752,7 +5753,7 @@ class DateSub(Func, IntervalOp):
 
 class DateDiff(Func, TimeUnit):
     _sql_names = ["DATEDIFF", "DATE_DIFF"]
-    arg_types = {"this": True, "expression": True, "unit": False}
+    arg_types = {"this": True, "expression": True, "unit": False, "zone": False}
 
 
 class DateTrunc(Func):
@@ -7865,7 +7866,7 @@ def parse_identifier(name: str | Identifier, dialect: DialectType = None) -> Ide
     return expression
 
 
-INTERVAL_STRING_RE = re.compile(r"\s*([0-9]+)\s*([a-zA-Z]+)\s*")
+INTERVAL_STRING_RE = re.compile(r"\s*(-?[0-9]+)\s*([a-zA-Z]+)\s*")
 
 
 def to_interval(interval: str | Literal) -> Interval:

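With the optional minus sign in `INTERVAL_STRING_RE`, negative interval strings can presumably be handled by `exp.to_interval` (the Postgres fix from PR #5000). A small sketch:

```python
from sqlglot import exp

interval = exp.to_interval("-2 days")
print(interval.sql())  # something like: INTERVAL '-2' DAYS
```
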
@@ -2782,7 +2782,9 @@ class Generator(metaclass=_Generator):
         if not partition and not order and not spec and alias:
             return f"{this} {alias}"
 
-        args = " ".join(arg for arg in (alias, first, partition, order, spec) if arg)
+        args = self.format_args(
+            *[arg for arg in (alias, first, partition, order, spec) if arg], sep=" "
+        )
         return f"{this} ({args})"
 
     def partition_by_sql(self, expression: exp.Window | exp.MatchRecognize) -> str:

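This change routes the window body through `format_args`, so in pretty mode a long OVER (...) clause can wrap across lines instead of always being space-joined (the exp.Window fix from PR #4994). A quick way to see the effect:

```python
import sqlglot

sql = "SELECT SUM(x) OVER (PARTITION BY some_long_column ORDER BY another_long_column) FROM t"
print(sqlglot.parse_one(sql).sql(pretty=True))
```
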
@@ -211,16 +211,31 @@ def while_changing(expression: Expression, func: t.Callable[[Expression], E]) ->
     Returns:
         The transformed expression.
     """
-    while True:
-        for n in reversed(tuple(expression.walk())):
-            n._hash = hash(n)
+    end_hash: t.Optional[int] = None
 
-        start = hash(expression)
+    while True:
+        # No need to walk the AST– we've already cached the hashes in the previous iteration
+        if end_hash is None:
+            for n in reversed(tuple(expression.walk())):
+                n._hash = hash(n)
+
+        start_hash = hash(expression)
         expression = func(expression)
 
-        for n in expression.walk():
+        expression_nodes = tuple(expression.walk())
+
+        # Uncache previous caches so we can recompute them
+        for n in reversed(expression_nodes):
             n._hash = None
-        if start == hash(expression):
+            n._hash = hash(n)
+
+        end_hash = hash(expression)
+
+        if start_hash == end_hash:
+            # ... and reset the hash so we don't risk it becoming out of date if a mutation happens
+            for n in expression_nodes:
+                n._hash = None
+
             break
 
     return expression

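`while_changing` applies its callback to a fixed point, using hashes to detect "no further change"; the refactor above additionally caches node hashes between iterations so each pass avoids re-walking the whole tree. The skeleton of the idea, stripped of the caching:

```python
def until_stable(value, func):
    # Re-apply func until the value's hash stops changing.
    while True:
        before = hash(value)
        value = func(value)
        if before == hash(value):
            return value

# 17 -> 8 -> 4 -> 2 -> 1, then stable
assert until_stable(17, lambda n: n // 2 if n > 1 else n) == 1
```
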
@@ -5,7 +5,7 @@ import typing as t
 from collections import defaultdict
 
 from sqlglot import expressions as exp
-from sqlglot.helper import find_new_name
+from sqlglot.helper import find_new_name, seq_get
 from sqlglot.optimizer.scope import Scope, traverse_scope
 
 if t.TYPE_CHECKING:
@@ -217,6 +217,7 @@ def _mergeable(
         and not _is_a_window_expression_in_unmergable_operation()
         and not _is_recursive()
         and not (inner_select.args.get("order") and outer_scope.is_union)
+        and not isinstance(seq_get(inner_select.expressions, 0), exp.QueryTransform)
     )

@@ -5,6 +5,7 @@ from sqlglot.optimizer.qualify_columns import Resolver
 from sqlglot.optimizer.scope import Scope, traverse_scope
 from sqlglot.schema import ensure_schema
 from sqlglot.errors import OptimizeError
+from sqlglot.helper import seq_get
 
 # Sentinel value that means an outer query selecting ALL columns
 SELECT_ALL = object()
@@ -92,7 +93,13 @@ def pushdown_projections(expression, schema=None, remove_unused_selections=True)
         # Push the selected columns down to the next scope
         for name, (node, source) in scope.selected_sources.items():
             if isinstance(source, Scope):
-                columns = {SELECT_ALL} if scope.pivots else selects.get(name) or set()
+                select = seq_get(source.expression.selects, 0)
+
+                if scope.pivots or isinstance(select, exp.QueryTransform):
+                    columns = {SELECT_ALL}
+                else:
+                    columns = selects.get(name) or set()
+
                 referenced_columns[source].update(columns)
 
             column_aliases = node.alias_column_names

@@ -770,7 +770,7 @@ def qualify_outputs(scope_or_expression: Scope | exp.Expression) -> None:
     for i, (selection, aliased_column) in enumerate(
         itertools.zip_longest(scope.expression.selects, scope.outer_columns)
     ):
-        if selection is None:
+        if selection is None or isinstance(selection, exp.QueryTransform):
             break
 
         if isinstance(selection, exp.Subquery):
@@ -787,7 +787,7 @@ def qualify_outputs(scope_or_expression: Scope | exp.Expression) -> None:
 
         new_selections.append(selection)
 
-    if isinstance(scope.expression, exp.Select):
+    if new_selections and isinstance(scope.expression, exp.Select):
         scope.expression.set("expressions", new_selections)
 
 
@@ -945,7 +945,14 @@ class Resolver:
             else:
                 columns = set_op.named_selects
         else:
-            columns = source.expression.named_selects
+            select = seq_get(source.expression.selects, 0)
+
+            if isinstance(select, exp.QueryTransform):
+                # https://spark.apache.org/docs/3.5.1/sql-ref-syntax-qry-select-transform.html
+                schema = select.args.get("schema")
+                columns = [c.name for c in schema.expressions] if schema else ["key", "value"]
+            else:
+                columns = source.expression.named_selects
 
         node, _ = self.scope.selected_sources.get(name) or (None, None)
         if isinstance(node, Scope):

@@ -54,10 +54,10 @@ def qualify_tables(
 
     def _qualify(table: exp.Table) -> None:
         if isinstance(table.this, exp.Identifier):
-            if not table.args.get("db"):
-                table.set("db", db)
-            if not table.args.get("catalog") and table.args.get("db"):
-                table.set("catalog", catalog)
+            if db and not table.args.get("db"):
+                table.set("db", db.copy())
+            if catalog and not table.args.get("catalog") and table.args.get("db"):
+                table.set("catalog", catalog.copy())
 
     if (db or catalog) and not isinstance(expression, exp.Query):
         for node in expression.walk(prune=lambda n: isinstance(n, exp.Query)):
@@ -148,6 +148,7 @@ def qualify_tables(
         if table_alias:
             for p in exp.COLUMN_PARTS[1:]:
                 column.set(p, None)
 
-            column.set("table", table_alias)
+            column.set("table", table_alias.copy())
 
     return expression

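The fix above makes `qualify_tables` insert a fresh copy of the db/catalog identifier for each table, so qualified tables no longer share AST nodes (PR #4995). Usage is unchanged, per the function's docstring:

```python
from sqlglot import parse_one
from sqlglot.optimizer.qualify_tables import qualify_tables

e = qualify_tables(parse_one("SELECT * FROM t1, t2"), db="db", catalog="cat")
print(e.sql())  # SELECT * FROM cat.db.t1 AS t1, cat.db.t2 AS t2
```
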
@@ -40,7 +40,6 @@ def simplify(
     expression: exp.Expression,
     constant_propagation: bool = False,
     dialect: DialectType = None,
-    max_depth: t.Optional[int] = None,
 ):
     """
     Rewrite sqlglot AST to simplify expressions.
@@ -54,114 +53,99 @@ def simplify(
     Args:
         expression: expression to simplify
         constant_propagation: whether the constant propagation rule should be used
-        max_depth: Chains of Connectors (AND, OR, etc) exceeding `max_depth` will be skipped
     Returns:
         sqlglot.Expression: simplified expression
     """
 
     dialect = Dialect.get_or_raise(dialect)
 
-    def _simplify(expression, root=True):
-        if (
-            max_depth
-            and isinstance(expression, exp.Connector)
-            and not isinstance(expression.parent, exp.Connector)
-        ):
-            depth = connector_depth(expression)
-            if depth > max_depth:
-                logger.info(
-                    f"Skipping simplification because connector depth {depth} exceeds max {max_depth}"
-                )
-                return expression
-
-        if expression.meta.get(FINAL):
-            return expression
-
-        # group by expressions cannot be simplified, for example
-        # select x + 1 + 1 FROM y GROUP BY x + 1 + 1
-        # the projection must exactly match the group by key
-        group = expression.args.get("group")
-
-        if group and hasattr(expression, "selects"):
-            groups = set(group.expressions)
-            group.meta[FINAL] = True
-
-            for e in expression.selects:
-                for node in e.walk():
-                    if node in groups:
-                        e.meta[FINAL] = True
-                        break
-
-            having = expression.args.get("having")
-            if having:
-                for node in having.walk():
-                    if node in groups:
-                        having.meta[FINAL] = True
-                        break
-
-        # Pre-order transformations
-        node = expression
-        node = rewrite_between(node)
-        node = uniq_sort(node, root)
-        node = absorb_and_eliminate(node, root)
-        node = simplify_concat(node)
-        node = simplify_conditionals(node)
-
-        if constant_propagation:
-            node = propagate_constants(node, root)
-
-        exp.replace_children(node, lambda e: _simplify(e, False))
-
-        # Post-order transformations
-        node = simplify_not(node)
-        node = flatten(node)
-        node = simplify_connectors(node, root)
-        node = remove_complements(node, root)
-        node = simplify_coalesce(node, dialect)
-        node.parent = expression.parent
-        node = simplify_literals(node, root)
-        node = simplify_equality(node)
-        node = simplify_parens(node)
-        node = simplify_datetrunc(node, dialect)
-        node = sort_comparison(node)
-        node = simplify_startswith(node)
-
-        if root:
-            expression.replace(node)
-        return node
+    def _simplify(expression):
+        pre_transformation_stack = [expression]
+        post_transformation_stack = []
+
+        while pre_transformation_stack:
+            node = pre_transformation_stack.pop()
+
+            if node.meta.get(FINAL):
+                continue
+
+            # group by expressions cannot be simplified, for example
+            # select x + 1 + 1 FROM y GROUP BY x + 1 + 1
+            # the projection must exactly match the group by key
+            group = node.args.get("group")
+
+            if group and hasattr(node, "selects"):
+                groups = set(group.expressions)
+                group.meta[FINAL] = True
+
+                for s in node.selects:
+                    for n in s.walk():
+                        if n in groups:
+                            s.meta[FINAL] = True
+                            break
+
+                having = node.args.get("having")
+                if having:
+                    for n in having.walk():
+                        if n in groups:
+                            having.meta[FINAL] = True
+                            break
+
+            parent = node.parent
+            root = node is expression
+
+            new_node = rewrite_between(node)
+            new_node = uniq_sort(new_node, root)
+            new_node = absorb_and_eliminate(new_node, root)
+            new_node = simplify_concat(new_node)
+            new_node = simplify_conditionals(new_node)
+
+            if constant_propagation:
+                new_node = propagate_constants(new_node, root)
+
+            if new_node is not node:
+                node.replace(new_node)
+
+            pre_transformation_stack.extend(
+                n for n in new_node.iter_expressions(reverse=True) if not n.meta.get(FINAL)
+            )
+            post_transformation_stack.append((new_node, parent))
+
+        while post_transformation_stack:
+            node, parent = post_transformation_stack.pop()
+            root = node is expression
+
+            # Resets parent, arg_key, index pointers– this is needed because some of the
+            # previous transformations mutate the AST, leading to an inconsistent state
+            for k, v in tuple(node.args.items()):
+                node.set(k, v)
+
+            # Post-order transformations
+            new_node = simplify_not(node)
+            new_node = flatten(new_node)
+            new_node = simplify_connectors(new_node, root)
+            new_node = remove_complements(new_node, root)
+            new_node = simplify_coalesce(new_node, dialect)
+
+            new_node.parent = parent
+
+            new_node = simplify_literals(new_node, root)
+            new_node = simplify_equality(new_node)
+            new_node = simplify_parens(new_node)
+            new_node = simplify_datetrunc(new_node, dialect)
+            new_node = sort_comparison(new_node)
+            new_node = simplify_startswith(new_node)
+
+            if new_node is not node:
+                node.replace(new_node)
+
+        return new_node
 
     expression = while_changing(expression, _simplify)
     remove_where_true(expression)
     return expression
 
 
-def connector_depth(expression: exp.Expression) -> int:
-    """
-    Determine the maximum depth of a tree of Connectors.
-
-    For example:
-        >>> from sqlglot import parse_one
-        >>> connector_depth(parse_one("a AND b AND c AND d"))
-        3
-    """
-    stack = deque([(expression, 0)])
-    max_depth = 0
-
-    while stack:
-        expression, depth = stack.pop()
-
-        if not isinstance(expression, exp.Connector):
-            continue
-
-        depth += 1
-        max_depth = max(depth, max_depth)
-
-        stack.append((expression.left, depth))
-        stack.append((expression.right, depth))
-
-    return max_depth
-
-
 def catch(*exceptions):
     """Decorator that ignores a simplification function if any of `exceptions` are raised"""

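The public behavior of `simplify` is unchanged by the stack-based rewrite: it still rewrites the AST to a fixed point, just without Python recursion, which is why the `max_depth` guard and the `connector_depth` helper could be deleted as dead code. For example:

```python
import sqlglot
from sqlglot.optimizer.simplify import simplify

print(simplify(sqlglot.parse_one("TRUE AND TRUE")).sql())  # TRUE
print(simplify(sqlglot.parse_one("1 + 1 = x")).sql())      # x = 2
```
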
@@ -397,6 +397,7 @@ class Parser(metaclass=_Parser):
         TokenType.IMAGE,
         TokenType.VARIANT,
         TokenType.VECTOR,
+        TokenType.VOID,
         TokenType.OBJECT,
         TokenType.OBJECT_IDENTIFIER,
         TokenType.INET,
@@ -405,6 +406,7 @@ class Parser(metaclass=_Parser):
         TokenType.IPV4,
         TokenType.IPV6,
         TokenType.UNKNOWN,
+        TokenType.NOTHING,
         TokenType.NULL,
         TokenType.NAME,
         TokenType.TDIGEST,
@@ -579,6 +581,8 @@ class Parser(metaclass=_Parser):
 
     ALIAS_TOKENS = ID_VAR_TOKENS
 
+    COLON_PLACEHOLDER_TOKENS = ID_VAR_TOKENS
+
     ARRAY_CONSTRUCTORS = {
         "ARRAY": exp.Array,
         "LIST": exp.List,
@@ -799,6 +803,7 @@ class Parser(metaclass=_Parser):
         exp.Order: lambda self: self._parse_order(),
         exp.Ordered: lambda self: self._parse_ordered(),
         exp.Properties: lambda self: self._parse_properties(),
+        exp.PartitionedByProperty: lambda self: self._parse_partitioned_by(),
         exp.Qualify: lambda self: self._parse_qualify(),
         exp.Returning: lambda self: self._parse_returning(),
         exp.Select: lambda self: self._parse_select(),
@@ -900,7 +905,7 @@ class Parser(metaclass=_Parser):
         TokenType.PARAMETER: lambda self: self._parse_parameter(),
         TokenType.COLON: lambda self: (
             self.expression(exp.Placeholder, this=self._prev.text)
-            if self._match_set(self.ID_VAR_TOKENS)
+            if self._match_set(self.COLON_PLACEHOLDER_TOKENS)
             else None
         ),
     }
@@ -1999,7 +2004,7 @@ class Parser(metaclass=_Parser):
         # exp.Properties.Location.POST_SCHEMA and POST_WITH
         extend_props(self._parse_properties())
 
-        self._match(TokenType.ALIAS)
+        has_alias = self._match(TokenType.ALIAS)
         if not self._match_set(self.DDL_SELECT_TOKENS, advance=False):
             # exp.Properties.Location.POST_ALIAS
             extend_props(self._parse_properties())
@@ -2010,6 +2015,11 @@ class Parser(metaclass=_Parser):
         else:
             expression = self._parse_ddl_select()
 
+            # Some dialects also support using a table as an alias instead of a SELECT.
+            # Here we fallback to this as an alternative.
+            if not expression and has_alias:
+                expression = self._try_parse(self._parse_table_parts)
+
         if create_token.token_type == TokenType.TABLE:
             # exp.Properties.Location.POST_EXPRESSION
             extend_props(self._parse_properties())
@@ -5229,6 +5239,8 @@ class Parser(metaclass=_Parser):
                 this = self.expression(exp.DataType, this=self.expression(exp.Interval, unit=unit))
             else:
                 this = self.expression(exp.DataType, this=exp.DataType.Type.INTERVAL)
+        elif type_token == TokenType.VOID:
+            this = exp.DataType(this=exp.DataType.Type.NULL)
 
         if maybe_func and check_func:
             index2 = self._index
@@ -7416,7 +7428,7 @@ class Parser(metaclass=_Parser):
         if self._match_text_seq("WITH", "SYNC", "MODE") or self._match_text_seq(
             "WITH", "ASYNC", "MODE"
         ):
-            mode = f"WITH {self._tokens[self._index-2].text.upper()} MODE"
+            mode = f"WITH {self._tokens[self._index - 2].text.upper()} MODE"
         else:
             mode = None

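Two of the parser extensions above, exercised the way the new tests in this diff do: `parse_one(..., into=exp.PartitionedByProperty)` now works, and dialects can widen `COLON_PLACEHOLDER_TOKENS` (Snowflake adds `NUMBER`, enabling `:1`-style parameters). A sketch:

```python
import sqlglot
from sqlglot import exp

parsed = sqlglot.parse_one(
    "(a, bucket(4, b), truncate(3, c), month(d))",
    read="athena",
    into=exp.PartitionedByProperty,
)
assert isinstance(parsed, exp.PartitionedByProperty)

placeholder = sqlglot.parse_one("SELECT :1", read="snowflake")
assert placeholder.find(exp.Placeholder) is not None
```
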
@@ -222,6 +222,7 @@ class TokenType(AutoName):
     UNKNOWN = auto()
     VECTOR = auto()
     DYNAMIC = auto()
+    VOID = auto()
 
     # keywords
     ALIAS = auto()
@@ -333,6 +334,7 @@ class TokenType(AutoName):
     MODEL = auto()
     NATURAL = auto()
     NEXT = auto()
+    NOTHING = auto()
     NOTNULL = auto()
     NULL = auto()
     OBJECT_IDENTIFIER = auto()

@@ -276,15 +276,17 @@ class TestAthena(Validator):
                     exp.FileFormatProperty(this=exp.Literal.string("parquet")),
                     exp.LocationProperty(this=exp.Literal.string("s3://foo")),
                     exp.PartitionedByProperty(
-                        this=exp.Schema(expressions=[exp.to_column("partition_col")])
+                        this=exp.Schema(expressions=[exp.to_column("partition_col", quoted=True)])
                     ),
                 ]
             ),
             expression=exp.select("1"),
         )
 
+        # Even if identify=True, the column names should not be quoted within the string literals in the partitioned_by ARRAY[]
         self.assertEqual(
             ctas_hive.sql(dialect=self.dialect, identify=True),
-            "CREATE TABLE \"foo\".\"bar\" WITH (format='parquet', external_location='s3://foo', partitioned_by=ARRAY['\"partition_col\"']) AS SELECT 1",
+            "CREATE TABLE \"foo\".\"bar\" WITH (format='parquet', external_location='s3://foo', partitioned_by=ARRAY['partition_col']) AS SELECT 1",
         )
         self.assertEqual(
             ctas_hive.sql(dialect=self.dialect, identify=False),
@@ -303,7 +305,8 @@ class TestAthena(Validator):
                 expressions=[
                     exp.to_column("partition_col"),
                     exp.PartitionedByBucket(
-                        this=exp.to_column("a"), expression=exp.Literal.number(4)
+                        this=exp.to_column("a", quoted=True),
+                        expression=exp.Literal.number(4),
                     ),
                 ]
             )
@@ -312,11 +315,25 @@ class TestAthena(Validator):
             ),
             expression=exp.select("1"),
         )
+        # Even if identify=True, the column names should not be quoted within the string literals in the partitioning ARRAY[]
+        # Technically Trino's Iceberg connector does support quoted column names in the string literals but its undocumented
+        # so we dont do it to keep consistency with the Hive connector
         self.assertEqual(
             ctas_iceberg.sql(dialect=self.dialect, identify=True),
-            "CREATE TABLE \"foo\".\"bar\" WITH (table_type='iceberg', location='s3://foo', partitioning=ARRAY['\"partition_col\"', 'BUCKET(\"a\", 4)']) AS SELECT 1",
+            "CREATE TABLE \"foo\".\"bar\" WITH (table_type='iceberg', location='s3://foo', partitioning=ARRAY['partition_col', 'BUCKET(a, 4)']) AS SELECT 1",
         )
         self.assertEqual(
             ctas_iceberg.sql(dialect=self.dialect, identify=False),
             "CREATE TABLE foo.bar WITH (table_type='iceberg', location='s3://foo', partitioning=ARRAY['partition_col', 'BUCKET(a, 4)']) AS SELECT 1",
         )
 
+    def test_parse_partitioned_by_returns_iceberg_transforms(self):
+        # check that parse_into works for PartitionedByProperty and also that correct AST nodes are emitted for Iceberg transforms
+        parsed = self.parse_one(
+            "(a, bucket(4, b), truncate(3, c), month(d))", into=exp.PartitionedByProperty
+        )
+
+        assert isinstance(parsed, exp.PartitionedByProperty)
+        assert isinstance(parsed.this, exp.Schema)
+        assert next(n for n in parsed.this.expressions if isinstance(n, exp.PartitionedByBucket))
+        assert next(n for n in parsed.this.expressions if isinstance(n, exp.PartitionByTruncate))

@@ -448,14 +448,13 @@ LANGUAGE js AS
             "SELECT SUM(x RESPECT NULLS) AS x",
             read={
                 "bigquery": "SELECT SUM(x RESPECT NULLS) AS x",
-                "duckdb": "SELECT SUM(x RESPECT NULLS) AS x",
                 "postgres": "SELECT SUM(x) RESPECT NULLS AS x",
                 "spark": "SELECT SUM(x) RESPECT NULLS AS x",
                 "snowflake": "SELECT SUM(x) RESPECT NULLS AS x",
             },
             write={
                 "bigquery": "SELECT SUM(x RESPECT NULLS) AS x",
-                "duckdb": "SELECT SUM(x RESPECT NULLS) AS x",
+                "duckdb": "SELECT SUM(x) AS x",
                 "postgres": "SELECT SUM(x) RESPECT NULLS AS x",
                 "spark": "SELECT SUM(x) RESPECT NULLS AS x",
                 "snowflake": "SELECT SUM(x) RESPECT NULLS AS x",
@@ -465,7 +464,7 @@ LANGUAGE js AS
             "SELECT PERCENTILE_CONT(x, 0.5 RESPECT NULLS) OVER ()",
             write={
                 "bigquery": "SELECT PERCENTILE_CONT(x, 0.5 RESPECT NULLS) OVER ()",
-                "duckdb": "SELECT QUANTILE_CONT(x, 0.5 RESPECT NULLS) OVER ()",
+                "duckdb": "SELECT QUANTILE_CONT(x, 0.5) OVER ()",
                 "spark": "SELECT PERCENTILE_CONT(x, 0.5) RESPECT NULLS OVER ()",
             },
         )

@@ -739,6 +739,12 @@ class TestClickhouse(Validator):
             with self.subTest(f"Casting to ClickHouse {data_type}"):
                 self.validate_identity(f"SELECT CAST(val AS {data_type})")
 
+    def test_nothing_type(self):
+        data_types = ["Nothing", "Nullable(Nothing)"]
+        for data_type in data_types:
+            with self.subTest(f"Casting to ClickHouse {data_type}"):
+                self.validate_identity(f"SELECT CAST(val AS {data_type})")
+
     def test_aggregate_function_column_with_any_keyword(self):
         # Regression test for https://github.com/tobymao/sqlglot/issues/4723
         self.validate_all(
@@ -766,6 +772,17 @@ ORDER BY (
             pretty=True,
         )
 
+    def test_create_table_as_alias(self):
+        ctas_alias = "CREATE TABLE my_db.my_table AS another_db.another_table"
+
+        expected = exp.Create(
+            this=exp.to_table("my_db.my_table"),
+            kind="TABLE",
+            expression=exp.to_table("another_db.another_table"),
+        )
+        self.assertEqual(self.parse_one(ctas_alias), expected)
+        self.validate_identity(ctas_alias)
+
     def test_ddl(self):
         db_table_expr = exp.Table(this=None, db=exp.to_identifier("foo"), catalog=None)
         create_with_cluster = exp.Create(
@@ -1220,6 +1237,15 @@ LIFETIME(MIN 0 MAX 0)""",
                         f"SELECT {func_alias}(SECOND, 1, bar)",
                         f"SELECT {func_name}(SECOND, 1, bar)",
                     )
+        # 4-arg functions of type <func>(unit, value, date, timezone)
+        for func in (("DATE_DIFF", "DATEDIFF"),):
+            func_name = func[0]
+            for func_alias in func:
+                with self.subTest(f"Test 4-arg date-time function {func_alias}"):
+                    self.validate_identity(
+                        f"SELECT {func_alias}(SECOND, 1, bar, 'UTC')",
+                        f"SELECT {func_name}(SECOND, 1, bar, 'UTC')",
+                    )
 
     def test_convert(self):
         self.assertEqual(

@@ -7,6 +7,12 @@ class TestDatabricks(Validator):
     dialect = "databricks"
 
     def test_databricks(self):
+        null_type = exp.DataType.build("VOID", dialect="databricks")
+        self.assertEqual(null_type.sql(), "NULL")
+        self.assertEqual(null_type.sql("databricks"), "VOID")
+
+        self.validate_identity("SELECT CAST(NULL AS VOID)")
+        self.validate_identity("SELECT void FROM t")
         self.validate_identity("SELECT * FROM stream")
         self.validate_identity("SELECT t.current_time FROM t")
         self.validate_identity("ALTER TABLE labels ADD COLUMN label_score FLOAT")
@@ -89,7 +95,7 @@ class TestDatabricks(Validator):
         self.validate_all(
             "CREATE TABLE foo (x INT GENERATED ALWAYS AS (YEAR(y)))",
             write={
-                "databricks": "CREATE TABLE foo (x INT GENERATED ALWAYS AS (YEAR(TO_DATE(y))))",
+                "databricks": "CREATE TABLE foo (x INT GENERATED ALWAYS AS (YEAR(y)))",
                 "tsql": "CREATE TABLE foo (x AS YEAR(CAST(y AS DATE)))",
             },
         )

Some files were not shown because too many files have changed in this diff.