
Merging upstream version 25.32.0.

Signed-off-by: Daniel Baumann <daniel@debian.org>
Daniel Baumann 2025-02-13 21:57:37 +01:00
parent 160ab5bf81
commit 02152e9ba6
Signed by: daniel
GPG key ID: FBB4F0E80A80222F
74 changed files with 2284 additions and 1814 deletions

View file

@@ -1,6 +1,11 @@
Changelog
=========
## [v25.31.4] - 2024-11-17
### :bug: Bug Fixes
- [`59b8b6d`](https://github.com/tobymao/sqlglot/commit/59b8b6d1409b4112d425cc31db45519d5936b6fa) - preserve column quoting in DISTINCT ON elimination *(commit by [@georgesittas](https://github.com/georgesittas))*
## [v25.31.3] - 2024-11-17
### :sparkles: New Features
- [`835e717`](https://github.com/tobymao/sqlglot/commit/835e71795f994599dbc19f1a5969b464154926e1) - **clickhouse**: transform function support *(PR [#4408](https://github.com/tobymao/sqlglot/pull/4408) by [@GaliFFun](https://github.com/GaliFFun))*
@@ -5313,3 +5318,4 @@ Changelog
[v25.31.1]: https://github.com/tobymao/sqlglot/compare/v25.31.0...v25.31.1
[v25.31.2]: https://github.com/tobymao/sqlglot/compare/v25.31.1...v25.31.2
[v25.31.3]: https://github.com/tobymao/sqlglot/compare/v25.31.2...v25.31.3
[v25.31.4]: https://github.com/tobymao/sqlglot/compare/v25.31.3...v25.31.4

View file

@@ -1,6 +1,6 @@
![SQLGlot logo](sqlglot.png)
SQLGlot is a no-dependency SQL parser, transpiler, optimizer, and engine. It can be used to format SQL or translate between [23 different dialects](https://github.com/tobymao/sqlglot/blob/main/sqlglot/dialects/__init__.py) like [DuckDB](https://duckdb.org/), [Presto](https://prestodb.io/) / [Trino](https://trino.io/), [Spark](https://spark.apache.org/) / [Databricks](https://www.databricks.com/), [Snowflake](https://www.snowflake.com/en/), and [BigQuery](https://cloud.google.com/bigquery/). It aims to read a wide variety of SQL inputs and output syntactically and semantically correct SQL in the targeted dialects.
SQLGlot is a no-dependency SQL parser, transpiler, optimizer, and engine. It can be used to format SQL or translate between [24 different dialects](https://github.com/tobymao/sqlglot/blob/main/sqlglot/dialects/__init__.py) like [DuckDB](https://duckdb.org/), [Presto](https://prestodb.io/) / [Trino](https://trino.io/), [Spark](https://spark.apache.org/) / [Databricks](https://www.databricks.com/), [Snowflake](https://www.snowflake.com/en/), and [BigQuery](https://cloud.google.com/bigquery/). It aims to read a wide variety of SQL inputs and output syntactically and semantically correct SQL in the targeted dialects.
It is a very comprehensive generic SQL parser with a robust [test suite](https://github.com/tobymao/sqlglot/blob/main/tests/). It is also quite [performant](#benchmarks), while being written purely in Python.
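For example, a minimal sketch of the advertised use (the query and dialect pair are illustrative, borrowed from the project's own docs):

import sqlglot

# Translate a DuckDB expression into the equivalent Spark SQL.
print(sqlglot.transpile("SELECT EPOCH_MS(1618088028295)", read="duckdb", write="spark")[0])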

File diff suppressed because one or more lines are too long

View file

@@ -76,8 +76,8 @@
__version_tuple__: VERSION_TUPLE
version_tuple: VERSION_TUPLE

__version__ = version = '25.31.3'
__version_tuple__ = version_tuple = (25, 31, 3)
__version__ = version = '25.31.4'
__version_tuple__ = version_tuple = (25, 31, 4)
@@ -97,7 +97,7 @@
version: str =
'25.31.3'
'25.31.4'
@@ -109,7 +109,7 @@
version_tuple: object =
(25, 31, 3)
(25, 31, 4)

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View file

@@ -58980,7 +58980,7 @@ Otherwise, this resets the expressions.
DataType.STRUCT_TYPES =
{<Type.STRUCT: 'STRUCT'>, <Type.NESTED: 'NESTED'>, <Type.OBJECT: 'OBJECT'>, <Type.UNION: 'UNION'>}
{<Type.UNION: 'UNION'>, <Type.NESTED: 'NESTED'>, <Type.OBJECT: 'OBJECT'>, <Type.STRUCT: 'STRUCT'>}
@@ -59005,7 +59005,7 @@ Otherwise, this resets the expressions.
DataType.NESTED_TYPES =
{<Type.STRUCT: 'STRUCT'>, <Type.OBJECT: 'OBJECT'>, <Type.MAP: 'MAP'>, <Type.NESTED: 'NESTED'>, <Type.LIST: 'LIST'>, <Type.UNION: 'UNION'>, <Type.ARRAY: 'ARRAY'>}
{<Type.UNION: 'UNION'>, <Type.NESTED: 'NESTED'>, <Type.LIST: 'LIST'>, <Type.MAP: 'MAP'>, <Type.STRUCT: 'STRUCT'>, <Type.ARRAY: 'ARRAY'>, <Type.OBJECT: 'OBJECT'>}
@@ -59018,7 +59018,7 @@ Otherwise, this resets the expressions.
DataType.TEXT_TYPES =
{<Type.CHAR: 'CHAR'>, <Type.TEXT: 'TEXT'>, <Type.NAME: 'NAME'>, <Type.NCHAR: 'NCHAR'>, <Type.NVARCHAR: 'NVARCHAR'>, <Type.VARCHAR: 'VARCHAR'>}
{<Type.CHAR: 'CHAR'>, <Type.NCHAR: 'NCHAR'>, <Type.NVARCHAR: 'NVARCHAR'>, <Type.NAME: 'NAME'>, <Type.VARCHAR: 'VARCHAR'>, <Type.TEXT: 'TEXT'>}
@@ -59031,7 +59031,7 @@ Otherwise, this resets the expressions.
DataType.SIGNED_INTEGER_TYPES =
{<Type.INT: 'INT'>, <Type.BIGINT: 'BIGINT'>, <Type.TINYINT: 'TINYINT'>, <Type.SMALLINT: 'SMALLINT'>, <Type.INT128: 'INT128'>, <Type.MEDIUMINT: 'MEDIUMINT'>, <Type.INT256: 'INT256'>}
{<Type.INT256: 'INT256'>, <Type.SMALLINT: 'SMALLINT'>, <Type.BIGINT: 'BIGINT'>, <Type.INT128: 'INT128'>, <Type.MEDIUMINT: 'MEDIUMINT'>, <Type.TINYINT: 'TINYINT'>, <Type.INT: 'INT'>}
@@ -59044,7 +59044,7 @@ Otherwise, this resets the expressions.
DataType.UNSIGNED_INTEGER_TYPES =
{<Type.UINT: 'UINT'>, <Type.USMALLINT: 'USMALLINT'>, <Type.UINT128: 'UINT128'>, <Type.UINT256: 'UINT256'>, <Type.UMEDIUMINT: 'UMEDIUMINT'>, <Type.UTINYINT: 'UTINYINT'>, <Type.UBIGINT: 'UBIGINT'>}
{<Type.UBIGINT: 'UBIGINT'>, <Type.UINT256: 'UINT256'>, <Type.UTINYINT: 'UTINYINT'>, <Type.UMEDIUMINT: 'UMEDIUMINT'>, <Type.UINT128: 'UINT128'>, <Type.USMALLINT: 'USMALLINT'>, <Type.UINT: 'UINT'>}
@@ -59057,7 +59057,7 @@ Otherwise, this resets the expressions.
DataType.INTEGER_TYPES =
{<Type.UINT: 'UINT'>, <Type.USMALLINT: 'USMALLINT'>, <Type.INT: 'INT'>, <Type.BIGINT: 'BIGINT'>, <Type.TINYINT: 'TINYINT'>, <Type.UINT128: 'UINT128'>, <Type.UINT256: 'UINT256'>, <Type.SMALLINT: 'SMALLINT'>, <Type.INT128: 'INT128'>, <Type.UMEDIUMINT: 'UMEDIUMINT'>, <Type.UTINYINT: 'UTINYINT'>, <Type.UBIGINT: 'UBIGINT'>, <Type.BIT: 'BIT'>, <Type.MEDIUMINT: 'MEDIUMINT'>, <Type.INT256: 'INT256'>}
{<Type.INT256: 'INT256'>, <Type.SMALLINT: 'SMALLINT'>, <Type.BIGINT: 'BIGINT'>, <Type.UBIGINT: 'UBIGINT'>, <Type.UINT256: 'UINT256'>, <Type.UTINYINT: 'UTINYINT'>, <Type.INT128: 'INT128'>, <Type.UMEDIUMINT: 'UMEDIUMINT'>, <Type.MEDIUMINT: 'MEDIUMINT'>, <Type.UINT128: 'UINT128'>, <Type.USMALLINT: 'USMALLINT'>, <Type.TINYINT: 'TINYINT'>, <Type.INT: 'INT'>, <Type.BIT: 'BIT'>, <Type.UINT: 'UINT'>}
@@ -59082,7 +59082,7 @@ Otherwise, this resets the expressions.
DataType.REAL_TYPES =
{<Type.MONEY: 'MONEY'>, <Type.SMALLMONEY: 'SMALLMONEY'>, <Type.DECIMAL128: 'DECIMAL128'>, <Type.DECIMAL: 'DECIMAL'>, <Type.FLOAT: 'FLOAT'>, <Type.DECIMAL256: 'DECIMAL256'>, <Type.BIGDECIMAL: 'BIGDECIMAL'>, <Type.DECIMAL32: 'DECIMAL32'>, <Type.DOUBLE: 'DOUBLE'>, <Type.UDECIMAL: 'UDECIMAL'>, <Type.DECIMAL64: 'DECIMAL64'>}
{<Type.DECIMAL256: 'DECIMAL256'>, <Type.FLOAT: 'FLOAT'>, <Type.DECIMAL32: 'DECIMAL32'>, <Type.SMALLMONEY: 'SMALLMONEY'>, <Type.UDECIMAL: 'UDECIMAL'>, <Type.BIGDECIMAL: 'BIGDECIMAL'>, <Type.DECIMAL: 'DECIMAL'>, <Type.DECIMAL64: 'DECIMAL64'>, <Type.DECIMAL128: 'DECIMAL128'>, <Type.MONEY: 'MONEY'>, <Type.DOUBLE: 'DOUBLE'>}
@@ -59095,7 +59095,7 @@ Otherwise, this resets the expressions.
DataType.NUMERIC_TYPES =
{<Type.UINT: 'UINT'>, <Type.USMALLINT: 'USMALLINT'>, <Type.INT: 'INT'>, <Type.MONEY: 'MONEY'>, <Type.SMALLMONEY: 'SMALLMONEY'>, <Type.DECIMAL128: 'DECIMAL128'>, <Type.UINT128: 'UINT128'>, <Type.INT128: 'INT128'>, <Type.UMEDIUMINT: 'UMEDIUMINT'>, <Type.UTINYINT: 'UTINYINT'>, <Type.DECIMAL: 'DECIMAL'>, <Type.DECIMAL256: 'DECIMAL256'>, <Type.BIGDECIMAL: 'BIGDECIMAL'>, <Type.UDECIMAL: 'UDECIMAL'>, <Type.TINYINT: 'TINYINT'>, <Type.BIGINT: 'BIGINT'>, <Type.DECIMAL64: 'DECIMAL64'>, <Type.UINT256: 'UINT256'>, <Type.SMALLINT: 'SMALLINT'>, <Type.BIT: 'BIT'>, <Type.UBIGINT: 'UBIGINT'>, <Type.FLOAT: 'FLOAT'>, <Type.DECIMAL32: 'DECIMAL32'>, <Type.DOUBLE: 'DOUBLE'>, <Type.MEDIUMINT: 'MEDIUMINT'>, <Type.INT256: 'INT256'>}
{<Type.DECIMAL256: 'DECIMAL256'>, <Type.FLOAT: 'FLOAT'>, <Type.DECIMAL32: 'DECIMAL32'>, <Type.SMALLMONEY: 'SMALLMONEY'>, <Type.INT256: 'INT256'>, <Type.UBIGINT: 'UBIGINT'>, <Type.UINT256: 'UINT256'>, <Type.UDECIMAL: 'UDECIMAL'>, <Type.DECIMAL: 'DECIMAL'>, <Type.DECIMAL64: 'DECIMAL64'>, <Type.MEDIUMINT: 'MEDIUMINT'>, <Type.UINT128: 'UINT128'>, <Type.USMALLINT: 'USMALLINT'>, <Type.DECIMAL128: 'DECIMAL128'>, <Type.UINT: 'UINT'>, <Type.SMALLINT: 'SMALLINT'>, <Type.BIGINT: 'BIGINT'>, <Type.UTINYINT: 'UTINYINT'>, <Type.BIGDECIMAL: 'BIGDECIMAL'>, <Type.INT128: 'INT128'>, <Type.UMEDIUMINT: 'UMEDIUMINT'>, <Type.MONEY: 'MONEY'>, <Type.TINYINT: 'TINYINT'>, <Type.INT: 'INT'>, <Type.BIT: 'BIT'>, <Type.DOUBLE: 'DOUBLE'>}
@@ -59108,7 +59108,7 @@ Otherwise, this resets the expressions.
DataType.TEMPORAL_TYPES =
{<Type.DATETIME64: 'DATETIME64'>, <Type.TIMESTAMP_S: 'TIMESTAMP_S'>, <Type.DATE: 'DATE'>, <Type.DATE32: 'DATE32'>, <Type.TIMESTAMPTZ: 'TIMESTAMPTZ'>, <Type.TIMESTAMPNTZ: 'TIMESTAMPNTZ'>, <Type.TIMESTAMP_MS: 'TIMESTAMP_MS'>, <Type.TIMESTAMPLTZ: 'TIMESTAMPLTZ'>, <Type.TIME: 'TIME'>, <Type.DATETIME: 'DATETIME'>, <Type.TIMESTAMP: 'TIMESTAMP'>, <Type.TIMESTAMP_NS: 'TIMESTAMP_NS'>, <Type.TIMETZ: 'TIMETZ'>}
{<Type.TIMESTAMPTZ: 'TIMESTAMPTZ'>, <Type.TIMETZ: 'TIMETZ'>, <Type.DATETIME: 'DATETIME'>, <Type.TIMESTAMPLTZ: 'TIMESTAMPLTZ'>, <Type.TIMESTAMP_NS: 'TIMESTAMP_NS'>, <Type.TIMESTAMP_S: 'TIMESTAMP_S'>, <Type.TIMESTAMP: 'TIMESTAMP'>, <Type.DATE: 'DATE'>, <Type.TIME: 'TIME'>, <Type.DATETIME64: 'DATETIME64'>, <Type.TIMESTAMPNTZ: 'TIMESTAMPNTZ'>, <Type.TIMESTAMP_MS: 'TIMESTAMP_MS'>, <Type.DATE32: 'DATE32'>}

View file

@@ -11742,7 +11742,7 @@ Default: True
Generator.PARAMETERIZABLE_TEXT_TYPES =
{<Type.CHAR: 'CHAR'>, <Type.NVARCHAR: 'NVARCHAR'>, <Type.VARCHAR: 'VARCHAR'>, <Type.NCHAR: 'NCHAR'>}
{<Type.VARCHAR: 'VARCHAR'>, <Type.NVARCHAR: 'NVARCHAR'>, <Type.CHAR: 'CHAR'>, <Type.NCHAR: 'NCHAR'>}

View file

@@ -1874,7 +1874,7 @@ belong to some totally-ordered set.
DATE_UNITS =
{'week', 'month', 'day', 'quarter', 'year_month', 'year'}
{'year_month', 'day', 'month', 'quarter', 'week', 'year'}

File diff suppressed because one or more lines are too long

View file

@@ -581,7 +581,7 @@ queries if it would result in multiple table selects in a single query:
UNMERGABLE_ARGS =
{'with', 'into', 'settings', 'prewhere', 'sort', 'options', 'pivots', 'having', 'laterals', 'distribute', 'offset', 'format', 'match', 'cluster', 'limit', 'distinct', 'locks', 'kind', 'sample', 'group', 'operation_modifiers', 'windows', 'qualify', 'connect'}
{'sample', 'locks', 'prewhere', 'offset', 'pivots', 'windows', 'qualify', 'settings', 'into', 'laterals', 'connect', 'sort', 'options', 'kind', 'group', 'distribute', 'with', 'having', 'limit', 'format', 'match', 'cluster', 'distinct', 'operation_modifiers'}

View file

@@ -3315,7 +3315,7 @@ prefix are statically known.
JOINS =
{('RIGHT', 'OUTER'), ('', 'INNER'), ('RIGHT', ''), ('', '')}
{('RIGHT', ''), ('RIGHT', 'OUTER'), ('', ''), ('', 'INNER')}

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View file

@@ -9075,7 +9075,7 @@
Tokenizer.COMMANDS =
{<TokenType.SHOW: 'SHOW'>, <TokenType.EXECUTE: 'EXECUTE'>, <TokenType.FETCH: 'FETCH'>, <TokenType.COMMAND: 'COMMAND'>, <TokenType.RENAME: 'RENAME'>}
{<TokenType.SHOW: 'SHOW'>, <TokenType.COMMAND: 'COMMAND'>, <TokenType.FETCH: 'FETCH'>, <TokenType.EXECUTE: 'EXECUTE'>, <TokenType.RENAME: 'RENAME'>}
@@ -9087,7 +9087,7 @@
Tokenizer.COMMAND_PREFIX_TOKENS =
{<TokenType.BEGIN: 'BEGIN'>, <TokenType.SEMICOLON: 'SEMICOLON'>}
{<TokenType.SEMICOLON: 'SEMICOLON'>, <TokenType.BEGIN: 'BEGIN'>}

File diff suppressed because it is too large Load diff

View file

@@ -333,6 +333,7 @@ class BigQuery(Dialect):
HEX_LOWERCASE = True
FORCE_EARLY_ALIAS_REF_EXPANSION = True
EXPAND_ALIAS_REFS_EARLY_ONLY_IN_GROUP_BY = True
PRESERVE_ORIGINAL_NAMES = True
# https://cloud.google.com/bigquery/docs/reference/standard-sql/lexical#case_sensitivity
NORMALIZATION_STRATEGY = NormalizationStrategy.CASE_INSENSITIVE
@@ -534,6 +535,7 @@ class BigQuery(Dialect):
**parser.Parser.FUNCTION_PARSERS,
"ARRAY": lambda self: self.expression(exp.Array, expressions=[self._parse_statement()]),
"MAKE_INTERVAL": lambda self: self._parse_make_interval(),
"FEATURES_AT_TIME": lambda self: self._parse_features_at_time(),
}
FUNCTION_PARSERS.pop("TRIM")
@@ -764,7 +766,7 @@ class BigQuery(Dialect):
return unnest
def _parse_make_interval(self):
def _parse_make_interval(self) -> exp.MakeInterval:
expr = exp.MakeInterval()
for arg_key in expr.arg_types:
@@ -784,6 +786,23 @@ class BigQuery(Dialect):
return expr
def _parse_features_at_time(self) -> exp.FeaturesAtTime:
expr = self.expression(
exp.FeaturesAtTime,
this=(self._match(TokenType.TABLE) and self._parse_table())
or self._parse_select(nested=True),
)
while self._match(TokenType.COMMA):
arg = self._parse_lambda()
# Get the LHS of the Kwarg and set the arg to that value, e.g.
# "num_rows => 1" sets the expr's `num_rows` arg
if arg:
expr.set(arg.this.name, arg)
return expr
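For illustration, the new hook should let a BigQuery ML-style FEATURES_AT_TIME call round-trip (a hedged sketch; the table name and argument values are invented, and BigQuery's real function lives under the ML. prefix):

import sqlglot
from sqlglot import exp

sql = (
    "SELECT * FROM FEATURES_AT_TIME("
    "TABLE mydataset.feature_table, "
    "time => '2022-06-11 10:00:00+00', num_rows => 1, ignore_feature_nulls => TRUE)"
)
# Each kwarg's LHS becomes the corresponding arg on the exp.FeaturesAtTime node.
print(sqlglot.parse_one(sql, read="bigquery").find(exp.FeaturesAtTime))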
class Generator(generator.Generator):
INTERVAL_ALLOWS_PLURAL_FORM = False
JOIN_HINTS = False

View file

@@ -165,6 +165,7 @@ class ClickHouse(Dialect):
SAFE_DIVISION = True
LOG_BASE_FIRST: t.Optional[bool] = None
FORCE_EARLY_ALIAS_REF_EXPANSION = True
PRESERVE_ORIGINAL_NAMES = True
# https://github.com/ClickHouse/ClickHouse/issues/33935#issue-1112165779
NORMALIZATION_STRATEGY = NormalizationStrategy.CASE_SENSITIVE

View file

@@ -264,6 +264,13 @@ class Dialect(metaclass=_Dialect):
False: Disables function name normalization.
"""
PRESERVE_ORIGINAL_NAMES: bool = False
"""
Whether the name of the function should be preserved inside the node's metadata.
This can be useful for roundtripping deprecated vs. new functions that share an AST node,
e.g. JSON_VALUE vs. JSON_EXTRACT_SCALAR in BigQuery.
"""
LOG_BASE_FIRST: t.Optional[bool] = True
"""
Whether the base comes first in the `LOG` function.
@@ -397,6 +404,13 @@ class Dialect(metaclass=_Dialect):
ARRAY_AGG_INCLUDES_NULLS: t.Optional[bool] = True
"""Whether ArrayAgg needs to filter NULL values."""
PROMOTE_TO_INFERRED_DATETIME_TYPE = False
"""
This flag is used in the optimizer's canonicalize rule and determines whether x will be promoted
to the literal's type in x::DATE < '2020-01-01 12:05:03' (i.e., DATETIME). When false, the literal
is cast to x's type to match it instead.
"""
REGEXP_EXTRACT_DEFAULT_GROUP = 0
"""The default value for the capturing group."""

View file

@@ -26,6 +26,7 @@ def _str_to_date(self: Drill.Generator, expression: exp.StrToDate) -> str:
class Drill(Dialect):
NORMALIZE_FUNCTIONS: bool | str = False
PRESERVE_ORIGINAL_NAMES = True
NULL_ORDERING = "nulls_are_last"
DATE_FORMAT = "'yyyy-MM-dd'"
DATEINT_FORMAT = "'yyyyMMdd'"

View file

@@ -9,7 +9,6 @@ from sqlglot.dialects.dialect import (
JSON_EXTRACT_TYPE,
NormalizationStrategy,
approx_count_distinct_sql,
arg_max_or_min_no_count,
arrow_json_extract_sql,
binary_from_function,
bool_xor_sql,
@@ -310,13 +309,13 @@ class DuckDB(Dialect):
"^@": TokenType.CARET_AT,
"@>": TokenType.AT_GT,
"<@": TokenType.LT_AT,
"ATTACH": TokenType.COMMAND,
"ATTACH": TokenType.ATTACH,
"BINARY": TokenType.VARBINARY,
"BITSTRING": TokenType.BIT,
"BPCHAR": TokenType.TEXT,
"CHAR": TokenType.TEXT,
"CHARACTER VARYING": TokenType.TEXT,
"DETACH": TokenType.COMMAND,
"DETACH": TokenType.DETACH,
"EXCLUDE": TokenType.EXCEPT,
"LOGICAL": TokenType.BOOLEAN,
"ONLY": TokenType.ONLY,
@@ -449,6 +448,12 @@ class DuckDB(Dialect):
exp.DataType.Type.TEXT: lambda dtype: exp.DataType.build("TEXT"),
}
STATEMENT_PARSERS = {
**parser.Parser.STATEMENT_PARSERS,
TokenType.ATTACH: lambda self: self._parse_attach_detach(),
TokenType.DETACH: lambda self: self._parse_attach_detach(is_attach=False),
}
def _parse_table_sample(self, as_modifier: bool = False) -> t.Optional[exp.TableSample]:
# https://duckdb.org/docs/sql/samples.html
sample = super()._parse_table_sample(as_modifier=as_modifier)
@@ -484,6 +489,29 @@ class DuckDB(Dialect):
return super()._pivot_column_names(aggregations)
return pivot_column_names(aggregations, dialect="duckdb")
def _parse_attach_detach(self, is_attach: bool = True) -> exp.Attach | exp.Detach:
def _parse_attach_option() -> exp.AttachOption:
return self.expression(
exp.AttachOption,
this=self._parse_var(any_token=True),
expression=self._parse_field(any_token=True),
)
self._match(TokenType.DATABASE)
exists = self._parse_exists(not_=is_attach)
this = self._parse_alias(self._parse_primary_or_var(), explicit=True)
if self._match(TokenType.L_PAREN, advance=False):
expressions = self._parse_wrapped_csv(_parse_attach_option)
else:
expressions = None
return (
self.expression(exp.Attach, this=this, exists=exists, expressions=expressions)
if is_attach
else self.expression(exp.Detach, this=this, exists=exists)
)
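A usage sketch (the database path and option are illustrative):

import sqlglot

# ATTACH and DETACH now parse into structured exp.Attach / exp.Detach nodes
# instead of opaque commands.
attach = sqlglot.parse_one("ATTACH DATABASE IF NOT EXISTS 'file.db' AS file_db (TYPE SQLITE)", read="duckdb")
detach = sqlglot.parse_one("DETACH DATABASE IF EXISTS file_db", read="duckdb")
print(attach.sql("duckdb"))
print(detach.sql("duckdb"))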
class Generator(generator.Generator):
PARAMETER_TOKEN = "$"
NAMED_PLACEHOLDER_TOKEN = "$"
@@ -516,8 +544,6 @@ class DuckDB(Dialect):
exp.ApproxDistinct: approx_count_distinct_sql,
exp.Array: inline_array_unless_query,
exp.ArrayFilter: rename_func("LIST_FILTER"),
exp.ArgMax: arg_max_or_min_no_count("ARG_MAX"),
exp.ArgMin: arg_max_or_min_no_count("ARG_MIN"),
exp.ArraySort: _array_sort_sql,
exp.ArraySum: rename_func("LIST_SUM"),
exp.BitwiseXor: rename_func("XOR"),

View file

@@ -145,6 +145,8 @@ def _remove_ts_or_ds_to_date(
class MySQL(Dialect):
PROMOTE_TO_INFERRED_DATETIME_TYPE = True
# https://dev.mysql.com/doc/refman/8.0/en/identifiers.html
IDENTIFIERS_CAN_START_WITH_DIGIT = True
@@ -292,6 +294,8 @@ class MySQL(Dialect):
FUNCTIONS = {
**parser.Parser.FUNCTIONS,
"CHAR_LENGTH": exp.Length.from_arg_list,
"CHARACTER_LENGTH": exp.Length.from_arg_list,
"CONVERT_TZ": lambda args: exp.ConvertTimezone(
source_tz=seq_get(args, 1), target_tz=seq_get(args, 2), timestamp=seq_get(args, 0)
),
@@ -725,6 +729,7 @@ class MySQL(Dialect):
e: f"""GROUP_CONCAT({self.sql(e, "this")} SEPARATOR {self.sql(e, "separator") or "','"})""",
exp.ILike: no_ilike_sql,
exp.JSONExtractScalar: arrow_json_extract_sql,
exp.Length: rename_func("CHAR_LENGTH"),
exp.Max: max_or_greatest,
exp.Min: min_or_least,
exp.Month: _remove_ts_or_ds_to_date(),
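A quick sketch of the effect (the column name is illustrative): CHAR_LENGTH and CHARACTER_LENGTH now both parse into exp.Length, which MySQL renders back as CHAR_LENGTH:

import sqlglot

# Expected to print SELECT CHAR_LENGTH(name) FROM t under the new mapping.
print(sqlglot.transpile("SELECT CHARACTER_LENGTH(name) FROM t", read="mysql", write="mysql")[0])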

View file

@@ -15,7 +15,6 @@ from sqlglot.dialects.dialect import (
from sqlglot.helper import seq_get
from sqlglot.parser import OPTIONS_TYPE, build_coalesce
from sqlglot.tokens import TokenType
from sqlglot.errors import ParseError
if t.TYPE_CHECKING:
from sqlglot._typing import E
@@ -207,35 +206,6 @@ class Oracle(Dialect):
**kwargs,
)
def _parse_hint(self) -> t.Optional[exp.Hint]:
start_index = self._index
should_fallback_to_string = False
if not self._match(TokenType.HINT):
return None
hints = []
try:
for hint in iter(
lambda: self._parse_csv(
lambda: self._parse_hint_function_call() or self._parse_var(upper=True),
),
[],
):
hints.extend(hint)
except ParseError:
should_fallback_to_string = True
if not self._match_pair(TokenType.STAR, TokenType.SLASH):
should_fallback_to_string = True
if should_fallback_to_string:
self._retreat(start_index)
return self._parse_hint_fallback_to_string()
return self.expression(exp.Hint, expressions=hints)
def _parse_hint_function_call(self) -> t.Optional[exp.Expression]:
if not self._curr or not self._next or self._next.token_type != TokenType.L_PAREN:
return None
@@ -258,20 +228,6 @@ class Oracle(Dialect):
return args
def _parse_hint_fallback_to_string(self) -> t.Optional[exp.Hint]:
if self._match(TokenType.HINT):
start = self._curr
while self._curr and not self._match_pair(TokenType.STAR, TokenType.SLASH):
self._advance()
if not self._curr:
self.raise_error("Expected */ after HINT")
end = self._tokens[self._index - 3]
return exp.Hint(expressions=[self._find_sql(start, end)])
return None
def _parse_query_restrictions(self) -> t.Optional[exp.Expression]:
kind = self._parse_var_from_options(self.QUERY_RESTRICTIONS, raise_unmatched=False)

View file

@@ -230,6 +230,10 @@ class Presto(Dialect):
KEYWORDS = {
**tokens.Tokenizer.KEYWORDS,
"DEALLOCATE PREPARE": TokenType.COMMAND,
"DESCRIBE INPUT": TokenType.COMMAND,
"DESCRIBE OUTPUT": TokenType.COMMAND,
"RESET SESSION": TokenType.COMMAND,
"START": TokenType.BEGIN,
"MATCH_RECOGNIZE": TokenType.MATCH_RECOGNIZE,
"ROW": TokenType.STRUCT,

View file

@@ -1,6 +1,74 @@
from __future__ import annotations
from sqlglot.dialects.postgres import Postgres
from sqlglot.tokens import TokenType
import typing as t
from sqlglot import exp
class RisingWave(Postgres):
class Tokenizer(Postgres.Tokenizer):
KEYWORDS = {
**Postgres.Tokenizer.KEYWORDS,
"SINK": TokenType.SINK,
"SOURCE": TokenType.SOURCE,
}
class Parser(Postgres.Parser):
WRAPPED_TRANSFORM_COLUMN_CONSTRAINT = False
PROPERTY_PARSERS = {
**Postgres.Parser.PROPERTY_PARSERS,
"ENCODE": lambda self: self._parse_encode_property(),
"INCLUDE": lambda self: self._parse_include_property(),
"KEY": lambda self: self._parse_encode_property(key=True),
}
def _parse_table_hints(self) -> t.Optional[t.List[exp.Expression]]:
# RisingWave does not support table hints.
# Do nothing here to avoid conflicting with the WITH keyword in CREATE SINK statements.
return None
def _parse_include_property(self) -> t.Optional[exp.Expression]:
header: t.Optional[exp.Expression] = None
coldef: t.Optional[exp.Expression] = None
this = self._parse_var_or_string()
if not self._match(TokenType.ALIAS):
header = self._parse_field()
if header:
coldef = self.expression(exp.ColumnDef, this=header, kind=self._parse_types())
self._match(TokenType.ALIAS)
alias = self._parse_id_var(tokens=self.ALIAS_TOKENS)
return self.expression(exp.IncludeProperty, this=this, alias=alias, column_def=coldef)
def _parse_encode_property(self, key: t.Optional[bool] = None) -> exp.EncodeProperty:
self._match_text_seq("ENCODE")
this = self._parse_var_or_string()
if self._match(TokenType.L_PAREN, advance=False):
properties = self.expression(
exp.Properties, expressions=self._parse_wrapped_properties()
)
else:
properties = None
return self.expression(exp.EncodeProperty, this=this, properties=properties, key=key)
class Generator(Postgres.Generator):
LOCKING_READS_SUPPORTED = False
TRANSFORMS = {
**Postgres.Generator.TRANSFORMS,
exp.FileFormatProperty: lambda self, e: f"FORMAT {self.sql(e, 'this')}",
}
PROPERTIES_LOCATION = {
**Postgres.Generator.PROPERTIES_LOCATION,
exp.FileFormatProperty: exp.Properties.Location.POST_EXPRESSION,
}
EXPRESSION_PRECEDES_PROPERTIES_CREATABLES = {"SINK"}
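A hedged end-to-end sketch of the new RisingWave surface (the object names and connector options are invented; real pipelines would typically also carry FORMAT/ENCODE clauses):

import sqlglot

sql = (
    "CREATE SOURCE IF NOT EXISTS kafka_src ("
    "event_id BIGINT, ts TIMESTAMP, WATERMARK FOR ts AS ts - INTERVAL '5' SECOND"
    ") INCLUDE header AS kafka_header "
    "WITH (connector = 'kafka', topic = 'events')"
)
# The WATERMARK constraint, the INCLUDE property and the unwrapped `AS` transform
# should all parse under the new RisingWave parser.
print(sqlglot.parse_one(sql, read="risingwave").sql("risingwave"))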

View file

@@ -198,21 +198,14 @@ def _flatten_structured_types_unless_iceberg(expression: exp.Expression) -> exp.
return expression
def _unnest_generate_date_array(expression: exp.Expression) -> exp.Expression:
if isinstance(expression, exp.Select):
for unnest in expression.find_all(exp.Unnest):
if (
isinstance(unnest.parent, (exp.From, exp.Join))
and len(unnest.expressions) == 1
and isinstance(unnest.expressions[0], exp.GenerateDateArray)
):
def _unnest_generate_date_array(unnest: exp.Unnest) -> None:
generate_date_array = unnest.expressions[0]
start = generate_date_array.args.get("start")
end = generate_date_array.args.get("end")
step = generate_date_array.args.get("step")
if not start or not end or not isinstance(step, exp.Interval) or step.name != "1":
continue
return
unit = step.args.get("unit")
@@ -236,6 +229,28 @@ def _unnest_generate_date_array(expression: exp.Expression) -> exp.Expression:
unnest.set("expressions", [number_sequence])
unnest.replace(exp.select(date_add).from_(unnest.copy()).subquery(unnest_alias))
def _transform_generate_date_array(expression: exp.Expression) -> exp.Expression:
if isinstance(expression, exp.Select):
for generate_date_array in expression.find_all(exp.GenerateDateArray):
parent = generate_date_array.parent
# If GENERATE_DATE_ARRAY is used directly as an array (e.g. passed into ARRAY_LENGTH), the transformed Snowflake
# query is the following (it'll be unnested properly on the next iteration due to copy):
# SELECT ref(GENERATE_DATE_ARRAY(...)) -> SELECT ref((SELECT ARRAY_AGG(*) FROM UNNEST(GENERATE_DATE_ARRAY(...))))
if not isinstance(parent, exp.Unnest):
unnest = exp.Unnest(expressions=[generate_date_array.copy()])
generate_date_array.replace(
exp.select(exp.ArrayAgg(this=exp.Star())).from_(unnest).subquery()
)
if (
isinstance(parent, exp.Unnest)
and isinstance(parent.parent, (exp.From, exp.Join))
and len(parent.expressions) == 1
):
_unnest_generate_date_array(parent)
return expression
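A sketch of the intent (the dates are illustrative): when GENERATE_DATE_ARRAY is consumed as a plain array value rather than unnested, it is first wrapped in (SELECT ARRAY_AGG(*) FROM UNNEST(...)):

import sqlglot

sql = "SELECT ARRAY_LENGTH(GENERATE_DATE_ARRAY(DATE '2020-01-01', DATE '2020-01-10', INTERVAL 1 DAY))"
# The BigQuery array function has no direct Snowflake equivalent, hence the subquery rewrite.
print(sqlglot.transpile(sql, read="bigquery", write="snowflake")[0])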
@@ -465,6 +480,7 @@ class Snowflake(Dialect):
PROPERTY_PARSERS = {
**parser.Parser.PROPERTY_PARSERS,
"LOCATION": lambda self: self._parse_location_property(),
"TAG": lambda self: self._parse_tag(),
}
TYPE_CONVERTERS = {
@@ -546,6 +562,12 @@ class Snowflake(Dialect):
return self.expression(exp.Not, this=this)
def _parse_tag(self) -> exp.Tags:
return self.expression(
exp.Tags,
expressions=self._parse_wrapped_csv(self._parse_property),
)
def _parse_with_constraint(self) -> t.Optional[exp.Expression]:
if self._prev.token_type != TokenType.WITH:
self._retreat(self._index - 1)
@@ -565,13 +587,16 @@ class Snowflake(Dialect):
this=policy.to_dot() if isinstance(policy, exp.Column) else policy,
)
if self._match(TokenType.TAG):
return self.expression(
exp.TagColumnConstraint,
expressions=self._parse_wrapped_csv(self._parse_property),
)
return self._parse_tag()
return None
def _parse_with_property(self) -> t.Optional[exp.Expression] | t.List[exp.Expression]:
if self._match(TokenType.TAG):
return self._parse_tag()
return super()._parse_with_property()
def _parse_create(self) -> exp.Create | exp.Command:
expression = super()._parse_create()
if isinstance(expression, exp.Create) and expression.kind in self.NON_TABLE_CREATABLES:
@@ -893,7 +918,7 @@ class Snowflake(Dialect):
transforms.eliminate_distinct_on,
transforms.explode_to_unnest(),
transforms.eliminate_semi_and_anti_joins,
_unnest_generate_date_array,
_transform_generate_date_array,
]
),
exp.SafeDivide: lambda self, e: no_safe_divide_sql(self, e, "IFF"),

View file

@@ -178,7 +178,7 @@ def _build_hashbytes(args: t.List) -> exp.Expression:
return exp.func("HASHBYTES", *args)
DATEPART_ONLY_FORMATS = {"DW", "HOUR", "QUARTER"}
DATEPART_ONLY_FORMATS = {"DW", "WK", "HOUR", "QUARTER"}
def _format_sql(self: TSQL.Generator, expression: exp.NumberToStr | exp.TimeToStr) -> str:
@@ -398,8 +398,8 @@ class TSQL(Dialect):
"s": "%-S",
"millisecond": "%f",
"ms": "%f",
"weekday": "%W",
"dw": "%W",
"weekday": "%w",
"dw": "%w",
"month": "%m",
"mm": "%M",
"m": "%-M",

View file

@@ -295,7 +295,7 @@ class Expression(metaclass=_Expression):
return root
def copy(self):
def copy(self) -> Self:
"""
Returns a deep copy of the expression.
"""
@@ -1476,9 +1476,20 @@ class Describe(Expression):
"kind": False,
"expressions": False,
"partition": False,
"format": False,
}
# https://duckdb.org/docs/sql/statements/attach.html#attach
class Attach(Expression):
arg_types = {"this": True, "exists": False, "expressions": False}
# https://duckdb.org/docs/sql/statements/attach.html#detach
class Detach(Expression):
arg_types = {"this": True, "exists": False}
# https://duckdb.org/docs/guides/meta/summarize.html
class Summarize(Expression):
arg_types = {"this": True, "table": False}
@@ -1897,11 +1908,6 @@ class OnUpdateColumnConstraint(ColumnConstraintKind):
pass
# https://docs.snowflake.com/en/sql-reference/sql/create-table
class TagColumnConstraint(ColumnConstraintKind):
arg_types = {"expressions": True}
# https://docs.snowflake.com/en/sql-reference/sql/create-external-table#optional-parameters
class TransformColumnConstraint(ColumnConstraintKind):
pass
@@ -1923,6 +1929,11 @@ class UppercaseColumnConstraint(ColumnConstraintKind):
arg_types: t.Dict[str, t.Any] = {}
# https://docs.risingwave.com/processing/watermarks#syntax
class WatermarkColumnConstraint(Expression):
arg_types = {"this": True, "expression": True}
class PathColumnConstraint(ColumnConstraintKind):
pass
@@ -2966,6 +2977,11 @@ class SecureProperty(Property):
arg_types = {}
# https://docs.snowflake.com/en/sql-reference/sql/create-table
class Tags(ColumnConstraintKind, Property):
arg_types = {"expressions": True}
class TransformModelProperty(Property):
arg_types = {"expressions": True}
@@ -3013,6 +3029,14 @@ class WithProcedureOptions(Property):
arg_types = {"expressions": True}
class EncodeProperty(Property):
arg_types = {"this": True, "properties": False, "key": False}
class IncludeProperty(Property):
arg_types = {"this": True, "alias": False, "column_def": False}
class Properties(Expression):
arg_types = {"expressions": True}
@@ -3037,6 +3061,8 @@ class Properties(Expression):
"RETURNS": ReturnsProperty,
"ROW_FORMAT": RowFormatProperty,
"SORTKEY": SortKeyProperty,
"ENCODE": EncodeProperty,
"INCLUDE": IncludeProperty,
}
PROPERTY_TO_NAME = {v: k for k, v in NAME_TO_PROPERTY.items()}
@@ -4662,6 +4688,10 @@ class AddConstraint(Expression):
arg_types = {"expressions": True}
class AttachOption(Expression):
arg_types = {"this": True, "expression": False}
class DropPartition(Expression):
arg_types = {"expressions": True, "exists": False}
@@ -5757,6 +5787,10 @@ class FromBase64(Func):
pass
class FeaturesAtTime(Func):
arg_types = {"this": True, "time": False, "num_rows": False, "ignore_feature_nulls": False}
class ToBase64(Func):
pass
@@ -6034,6 +6068,7 @@ class JSONExtract(Binary, Func):
class JSONExtractArray(Func):
arg_types = {"this": True, "expression": False}
_sql_names = ["JSON_EXTRACT_ARRAY"]
class JSONExtractScalar(Binary, Func):
@@ -7832,6 +7867,7 @@ def cast(
types_are_equivalent = type_mapping.get(
existing_cast_type, existing_cast_type.value
) == type_mapping.get(new_cast_type, new_cast_type.value)
if expr.is_type(data_type) or types_are_equivalent:
return expr

View file

@@ -188,7 +188,7 @@ class Generator(metaclass=_Generator):
exp.StrictProperty: lambda *_: "STRICT",
exp.SwapTable: lambda self, e: f"SWAP WITH {self.sql(e, 'this')}",
exp.TemporaryProperty: lambda *_: "TEMPORARY",
exp.TagColumnConstraint: lambda self, e: f"TAG ({self.expressions(e, flat=True)})",
exp.Tags: lambda self, e: f"TAG ({self.expressions(e, flat=True)})",
exp.TitleColumnConstraint: lambda self, e: f"TITLE {self.sql(e, 'this')}",
exp.ToMap: lambda self, e: f"MAP {self.sql(e, 'this')}",
exp.ToTableProperty: lambda self, e: f"TO {self.sql(e.this)}",
@@ -496,6 +496,8 @@ class Generator(metaclass=_Generator):
PARAMETER_TOKEN = "@"
NAMED_PLACEHOLDER_TOKEN = ":"
EXPRESSION_PRECEDES_PROPERTIES_CREATABLES: t.Set[str] = set()
PROPERTIES_LOCATION = {
exp.AllowedValuesProperty: exp.Properties.Location.POST_SCHEMA,
exp.AlgorithmProperty: exp.Properties.Location.POST_CREATE,
@@ -520,6 +522,7 @@ class Generator(metaclass=_Generator):
exp.DistKeyProperty: exp.Properties.Location.POST_SCHEMA,
exp.DistStyleProperty: exp.Properties.Location.POST_SCHEMA,
exp.EmptyProperty: exp.Properties.Location.POST_SCHEMA,
exp.EncodeProperty: exp.Properties.Location.POST_EXPRESSION,
exp.EngineProperty: exp.Properties.Location.POST_SCHEMA,
exp.ExecuteAsProperty: exp.Properties.Location.POST_SCHEMA,
exp.ExternalProperty: exp.Properties.Location.POST_CREATE,
@@ -530,6 +533,7 @@ class Generator(metaclass=_Generator):
exp.HeapProperty: exp.Properties.Location.POST_WITH,
exp.InheritsProperty: exp.Properties.Location.POST_SCHEMA,
exp.IcebergProperty: exp.Properties.Location.POST_CREATE,
exp.IncludeProperty: exp.Properties.Location.POST_SCHEMA,
exp.InputModelProperty: exp.Properties.Location.POST_SCHEMA,
exp.IsolatedLoadingProperty: exp.Properties.Location.POST_NAME,
exp.JournalProperty: exp.Properties.Location.POST_NAME,
@@ -572,6 +576,7 @@ class Generator(metaclass=_Generator):
exp.StabilityProperty: exp.Properties.Location.POST_SCHEMA,
exp.StreamingTableProperty: exp.Properties.Location.POST_CREATE,
exp.StrictProperty: exp.Properties.Location.POST_SCHEMA,
exp.Tags: exp.Properties.Location.POST_WITH,
exp.TemporaryProperty: exp.Properties.Location.POST_CREATE,
exp.ToTableProperty: exp.Properties.Location.POST_SCHEMA,
exp.TransientProperty: exp.Properties.Location.POST_CREATE,
@@ -1154,7 +1159,12 @@ class Generator(metaclass=_Generator):
clone = self.sql(expression, "clone")
clone = f" {clone}" if clone else ""
expression_sql = f"CREATE{modifiers} {kind}{concurrently}{exists_sql} {this}{properties_sql}{expression_sql}{postexpression_props_sql}{index_sql}{no_schema_binding}{clone}"
if kind in self.EXPRESSION_PRECEDES_PROPERTIES_CREATABLES:
properties_expression = f"{expression_sql}{properties_sql}"
else:
properties_expression = f"{properties_sql}{expression_sql}"
expression_sql = f"CREATE{modifiers} {kind}{concurrently}{exists_sql} {this}{properties_expression}{postexpression_props_sql}{index_sql}{no_schema_binding}{clone}"
return self.prepend_ctes(expression, expression_sql)
def sequenceproperties_sql(self, expression: exp.SequenceProperties) -> str:
@@ -1193,7 +1203,10 @@ class Generator(metaclass=_Generator):
style = f" {style}" if style else ""
partition = self.sql(expression, "partition")
partition = f" {partition}" if partition else ""
return f"DESCRIBE{style} {self.sql(expression, 'this')}{partition}"
format = self.sql(expression, "format")
format = f" {format}" if format else ""
return f"DESCRIBE{style}{format} {self.sql(expression, 'this')}{partition}"
def heredoc_sql(self, expression: exp.Heredoc) -> str:
tag = self.sql(expression, "tag")
@@ -3519,10 +3532,10 @@ class Generator(metaclass=_Generator):
elif arg_value is not None:
args.append(arg_value)
if self.normalize_functions:
name = expression.sql_name()
else:
if self.dialect.PRESERVE_ORIGINAL_NAMES:
name = (expression._meta and expression.meta.get("name")) or expression.sql_name()
else:
name = expression.sql_name()
return self.func(name, *args)
@@ -4520,3 +4533,65 @@ class Generator(metaclass=_Generator):
dim = exp.Literal.number(1)
return self.func(self.ARRAY_SIZE_NAME, expression.this, dim)
def attach_sql(self, expression: exp.Attach) -> str:
this = self.sql(expression, "this")
exists_sql = " IF NOT EXISTS" if expression.args.get("exists") else ""
expressions = self.expressions(expression)
expressions = f" ({expressions})" if expressions else ""
return f"ATTACH{exists_sql} {this}{expressions}"
def detach_sql(self, expression: exp.Detach) -> str:
this = self.sql(expression, "this")
exists_sql = " IF EXISTS" if expression.args.get("exists") else ""
return f"DETACH{exists_sql} {this}"
def attachoption_sql(self, expression: exp.AttachOption) -> str:
this = self.sql(expression, "this")
value = self.sql(expression, "expression")
value = f" {value}" if value else ""
return f"{this}{value}"
def featuresattime_sql(self, expression: exp.FeaturesAtTime) -> str:
this_sql = self.sql(expression, "this")
if isinstance(expression.this, exp.Table):
this_sql = f"TABLE {this_sql}"
return self.func(
"FEATURES_AT_TIME",
this_sql,
expression.args.get("time"),
expression.args.get("num_rows"),
expression.args.get("ignore_feature_nulls"),
)
def watermarkcolumnconstraint_sql(self, expression: exp.WatermarkColumnConstraint) -> str:
return (
f"WATERMARK FOR {self.sql(expression, 'this')} AS {self.sql(expression, 'expression')}"
)
def encodeproperty_sql(self, expression: exp.EncodeProperty) -> str:
encode = "KEY ENCODE" if expression.args.get("key") else "ENCODE"
encode = f"{encode} {self.sql(expression, 'this')}"
properties = expression.args.get("properties")
if properties:
encode = f"{encode} {self.properties(properties)}"
return encode
def includeproperty_sql(self, expression: exp.IncludeProperty) -> str:
this = self.sql(expression, "this")
include = f"INCLUDE {this}"
column_def = self.sql(expression, "column_def")
if column_def:
include = f"{include} {column_def}"
alias = self.sql(expression, "alias")
if alias:
include = f"{include} AS {alias}"
return include

View file

@@ -4,10 +4,12 @@ import itertools
import typing as t
from sqlglot import exp
from sqlglot.dialects.dialect import Dialect, DialectType
from sqlglot.helper import is_date_unit, is_iso_date, is_iso_datetime
from sqlglot.optimizer.annotate_types import TypeAnnotator
def canonicalize(expression: exp.Expression) -> exp.Expression:
def canonicalize(expression: exp.Expression, dialect: DialectType = None) -> exp.Expression:
"""Converts a sql expression into a standard form.
This method relies on annotate_types because many of the
@@ -17,10 +19,12 @@ def canonicalize(expression: exp.Expression) -> exp.Expression:
expression: The expression to canonicalize.
"""
dialect = Dialect.get_or_raise(dialect)
def _canonicalize(expression: exp.Expression) -> exp.Expression:
expression = add_text_to_concat(expression)
expression = replace_date_funcs(expression)
expression = coerce_type(expression)
expression = coerce_type(expression, dialect.PROMOTE_TO_INFERRED_DATETIME_TYPE)
expression = remove_redundant_casts(expression)
expression = ensure_bools(expression, _replace_int_predicate)
expression = remove_ascending_order(expression)
@@ -68,11 +72,11 @@ COERCIBLE_DATE_OPS = (
)
def coerce_type(node: exp.Expression) -> exp.Expression:
def coerce_type(node: exp.Expression, promote_to_inferred_datetime_type: bool) -> exp.Expression:
if isinstance(node, COERCIBLE_DATE_OPS):
_coerce_date(node.left, node.right)
_coerce_date(node.left, node.right, promote_to_inferred_datetime_type)
elif isinstance(node, exp.Between):
_coerce_date(node.this, node.args["low"])
_coerce_date(node.this, node.args["low"], promote_to_inferred_datetime_type)
elif isinstance(node, exp.Extract) and not node.expression.type.is_type(
*exp.DataType.TEMPORAL_TYPES
):
@@ -128,17 +132,48 @@ def remove_ascending_order(expression: exp.Expression) -> exp.Expression:
return expression
def _coerce_date(a: exp.Expression, b: exp.Expression) -> None:
def _coerce_date(
a: exp.Expression,
b: exp.Expression,
promote_to_inferred_datetime_type: bool,
) -> None:
for a, b in itertools.permutations([a, b]):
if isinstance(b, exp.Interval):
a = _coerce_timeunit_arg(a, b.unit)
a_type = a.type
if (
a.type
and a.type.this in exp.DataType.TEMPORAL_TYPES
and b.type
and b.type.this in exp.DataType.TEXT_TYPES
not a_type
or a_type.this not in exp.DataType.TEMPORAL_TYPES
or not b.type
or b.type.this not in exp.DataType.TEXT_TYPES
):
_replace_cast(b, exp.DataType.Type.DATETIME)
continue
if promote_to_inferred_datetime_type:
if b.is_string:
date_text = b.name
if is_iso_date(date_text):
b_type = exp.DataType.Type.DATE
elif is_iso_datetime(date_text):
b_type = exp.DataType.Type.DATETIME
else:
b_type = a_type.this
else:
# If b is not a datetime string, we conservatively promote it to a DATETIME,
# in order to ensure there are no surprising truncations due to downcasting
b_type = exp.DataType.Type.DATETIME
target_type = (
b_type if b_type in TypeAnnotator.COERCES_TO.get(a_type.this, {}) else a_type
)
else:
target_type = a_type
if target_type != a_type:
_replace_cast(a, target_type)
_replace_cast(b, target_type)
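A minimal sketch of how the new dialect flag reaches this code path (the table and column are invented):

from sqlglot import parse_one
from sqlglot.optimizer.annotate_types import annotate_types
from sqlglot.optimizer.canonicalize import canonicalize

# canonicalize builds on type annotations, so annotate first.
expr = annotate_types(parse_one("SELECT CAST(x AS DATE) < '2020-01-01 12:05:03' FROM t"))
# MySQL sets PROMOTE_TO_INFERRED_DATETIME_TYPE = True, so the DATE side should be promoted
# to the literal's inferred DATETIME type instead of truncating the literal down to DATE.
print(canonicalize(expr, dialect="mysql").sql("mysql"))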
def _coerce_timeunit_arg(arg: exp.Expression, unit: t.Optional[exp.Expression]) -> exp.Expression:
@@ -168,7 +203,7 @@ def _coerce_datediff_args(node: exp.DateDiff) -> None:
e.replace(exp.cast(e.copy(), to=exp.DataType.Type.DATETIME))
def _replace_cast(node: exp.Expression, to: exp.DataType.Type) -> None:
def _replace_cast(node: exp.Expression, to: exp.DATA_TYPE) -> None:
node.replace(exp.cast(node.copy(), to=to))

View file

@@ -524,7 +524,9 @@ def _expand_struct_stars(
this = field.this.copy()
root, *parts = [part.copy() for part in itertools.chain(dot_parts, [this])]
new_column = exp.column(
t.cast(exp.Identifier, root), table=dot_column.args.get("table"), fields=parts
t.cast(exp.Identifier, root),
table=dot_column.args.get("table"),
fields=t.cast(t.List[exp.Identifier], parts),
)
new_selections.append(alias(new_column, this, copy=False))

View file

@@ -429,6 +429,8 @@ class Parser(metaclass=_Parser):
TokenType.VIEW,
TokenType.WAREHOUSE,
TokenType.STREAMLIT,
TokenType.SINK,
TokenType.SOURCE,
}
CREATABLES = {
@@ -450,6 +452,7 @@ class Parser(metaclass=_Parser):
# Tokens that can represent identifiers
ID_VAR_TOKENS = {
TokenType.ALL,
TokenType.ATTACH,
TokenType.VAR,
TokenType.ANTI,
TokenType.APPLY,
@@ -471,6 +474,7 @@ class Parser(metaclass=_Parser):
TokenType.DELETE,
TokenType.DESC,
TokenType.DESCRIBE,
TokenType.DETACH,
TokenType.DICTIONARY,
TokenType.DIV,
TokenType.END,
@@ -754,6 +758,7 @@ class Parser(metaclass=_Parser):
exp.From: lambda self: self._parse_from(joins=True),
exp.Group: lambda self: self._parse_group(),
exp.Having: lambda self: self._parse_having(),
exp.Hint: lambda self: self._parse_hint_body(),
exp.Identifier: lambda self: self._parse_id_var(),
exp.Join: lambda self: self._parse_join(),
exp.Lambda: lambda self: self._parse_lambda(),
@@ -1053,6 +1058,11 @@ class Parser(metaclass=_Parser):
"TTL": lambda self: self.expression(exp.MergeTreeTTL, expressions=[self._parse_bitwise()]),
"UNIQUE": lambda self: self._parse_unique(),
"UPPERCASE": lambda self: self.expression(exp.UppercaseColumnConstraint),
"WATERMARK": lambda self: self.expression(
exp.WatermarkColumnConstraint,
this=self._match(TokenType.FOR) and self._parse_column(),
expression=self._match(TokenType.ALIAS) and self._parse_disjunction(),
),
"WITH": lambda self: self.expression(
exp.Properties, expressions=self._parse_wrapped_properties()
),
@@ -1087,6 +1097,7 @@ class Parser(metaclass=_Parser):
"PERIOD",
"PRIMARY KEY",
"UNIQUE",
"WATERMARK",
}
NO_PAREN_FUNCTION_PARSERS = {
@@ -1356,6 +1367,9 @@ class Parser(metaclass=_Parser):
# Whether a PARTITION clause can follow a table reference
SUPPORTS_PARTITION_SELECTION = False
# Whether the `name AS expr` schema/column constraint requires parentheses around `expr`
WRAPPED_TRANSFORM_COLUMN_CONSTRAINT = True
__slots__ = (
"error_level",
"error_message_context",
@@ -1909,6 +1923,8 @@ class Parser(metaclass=_Parser):
elif create_token.token_type == TokenType.VIEW:
if self._match_text_seq("WITH", "NO", "SCHEMA", "BINDING"):
no_schema_binding = True
elif create_token.token_type in (TokenType.SINK, TokenType.SOURCE):
extend_props(self._parse_properties())
shallow = self._match_text_seq("SHALLOW")
@@ -2609,7 +2625,14 @@ class Parser(metaclass=_Parser):
if self._match(TokenType.DOT):
style = None
self._retreat(self._index - 2)
format = self._parse_property() if self._match(TokenType.FORMAT, advance=False) else None
if self._match_set(self.STATEMENT_PARSERS, advance=False):
this = self._parse_statement()
else:
this = self._parse_table(schema=True)
properties = self._parse_properties()
expressions = properties.expressions if properties else None
partition = self._parse_partition()
@@ -2620,6 +2643,7 @@ class Parser(metaclass=_Parser):
kind=kind,
expressions=expressions,
partition=partition,
format=format,
)
def _parse_multitable_inserts(self, comments: t.Optional[t.List[str]]) -> exp.MultitableInserts:
@@ -3216,22 +3240,43 @@ class Parser(metaclass=_Parser):
return this
def _parse_hint(self) -> t.Optional[exp.Hint]:
if self._match(TokenType.HINT):
def _parse_hint_fallback_to_string(self) -> t.Optional[exp.Hint]:
start = self._curr
while self._curr:
self._advance()
end = self._tokens[self._index - 1]
return exp.Hint(expressions=[self._find_sql(start, end)])
def _parse_hint_function_call(self) -> t.Optional[exp.Expression]:
return self._parse_function_call()
def _parse_hint_body(self) -> t.Optional[exp.Hint]:
start_index = self._index
should_fallback_to_string = False
hints = []
try:
for hint in iter(
lambda: self._parse_csv(
lambda: self._parse_function() or self._parse_var(upper=True)
lambda: self._parse_hint_function_call() or self._parse_var(upper=True),
),
[],
):
hints.extend(hint)
except ParseError:
should_fallback_to_string = True
if not self._match_pair(TokenType.STAR, TokenType.SLASH):
self.raise_error("Expected */ after HINT")
if should_fallback_to_string or self._curr:
self._retreat(start_index)
return self._parse_hint_fallback_to_string()
return self.expression(exp.Hint, expressions=hints)
def _parse_hint(self) -> t.Optional[exp.Hint]:
if self._match(TokenType.HINT) and self._prev_comments:
return exp.maybe_parse(self._prev_comments[0], into=exp.Hint, dialect=self.dialect)
return None
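A sketch of the reworked flow (the hints are illustrative Oracle ones): the tokenizer now attaches the /*+ ... */ body to the HINT token as a comment, and maybe_parse re-parses it, falling back to a single raw string expression when the body is not valid hint syntax:

import sqlglot

expr = sqlglot.parse_one("SELECT /*+ LEADING(e d) USE_NL(d) */ e.name FROM e, d", read="oracle")
print(expr.args.get("hint"))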
def _parse_into(self) -> t.Optional[exp.Into]:
@@ -5109,9 +5154,8 @@ class Parser(metaclass=_Parser):
else:
field = self._parse_field(any_token=True, anonymous_func=True)
if isinstance(field, exp.Func) and this:
# bigquery allows function calls like x.y.count(...)
# SAFE.SUBSTR(...)
if isinstance(field, (exp.Func, exp.Window)) and this:
# BQ & snowflake allow function calls like x.y.count(...), SAFE.SUBSTR(...) etc
# https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-reference#function_call_rules
this = exp.replace_tree(
this,
@@ -5135,6 +5179,11 @@ class Parser(metaclass=_Parser):
db=this.args.get("table"),
catalog=this.args.get("db"),
)
elif isinstance(field, exp.Window):
# Move the exp.Dot chain onto the window's function
window_func = self.expression(exp.Dot, this=this, expression=field.this)
field.set("this", window_func)
this = field
else:
this = self.expression(exp.Dot, this=this, expression=field)
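The new exp.Window branch lets a dotted path end in a windowed call: the accumulated qualifiers are grafted onto the window's function, so db.schema.FUNC(a) OVER () parses as one qualified call instead of a column plus a window. A minimal sketch mirroring the Snowflake test later in this diff:

from sqlglot import exp, parse_one

ast = parse_one("SELECT * FROM TABLE(db.schema.FUNC(a) OVER ())", read="snowflake")
window = ast.find(exp.Window)
print(window.this.sql("snowflake"))  # db.schema.FUNC(a)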
@ -5308,7 +5357,7 @@ class Parser(metaclass=_Parser):
func = function(args)
func = self.validate_expression(func, args)
if not self.dialect.NORMALIZE_FUNCTIONS:
if self.dialect.PRESERVE_ORIGINAL_NAMES:
func.meta["name"] = this
this = func
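The NORMALIZE_FUNCTIONS check becomes the dialect-level PRESERVE_ORIGINAL_NAMES flag: the written function name is stashed in func.meta so generation can reproduce it even when normalization is requested. A minimal sketch mirroring the BigQuery JSON tests further down (BigQuery is assumed to set the flag):

import sqlglot

sql = """SELECT JSON_EXTRACT_SCALAR('{"age": "6"}', '$.age')"""
out = sqlglot.parse_one(sql, read="bigquery").sql("bigquery", normalize_functions="upper")
assert out == sql  # the original spelling survives normalization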
@ -5461,12 +5510,19 @@ class Parser(metaclass=_Parser):
not_null=self._match_pair(TokenType.NOT, TokenType.NULL),
)
constraints.append(self.expression(exp.ColumnConstraint, kind=constraint_kind))
elif kind and self._match_pair(TokenType.ALIAS, TokenType.L_PAREN, advance=False):
self._match(TokenType.ALIAS)
elif (
kind
and self._match(TokenType.ALIAS, advance=False)
and (
not self.WRAPPED_TRANSFORM_COLUMN_CONSTRAINT
or (self._next and self._next.token_type == TokenType.L_PAREN)
)
):
self._advance()
constraints.append(
self.expression(
exp.ColumnConstraint,
kind=exp.TransformColumnConstraint(this=self._parse_field()),
kind=exp.TransformColumnConstraint(this=self._parse_disjunction()),
)
)
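This branch is now gated by WRAPPED_TRANSFORM_COLUMN_CONSTRAINT (declared near the top of this diff): dialects keeping the default True still require AS (expr), while dialects that opt out can write AS expr bare, as RisingWave's generated columns do in the tests below; either way the body is parsed as a full disjunction. A minimal sketch for the default, wrapped form:

from sqlglot import exp, parse_one

ast = parse_one("CREATE TABLE t (c INT AS (a + 1))")
print(ast.find(exp.TransformColumnConstraint).this.sql())  # (a + 1)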
@ -7104,7 +7160,6 @@ class Parser(metaclass=_Parser):
while True:
key = self._parse_id_var()
value = self._parse_primary()
if not key and value is None:
break
settings.append(self.expression(exp.DictSubProperty, this=key, value=value))

View file

@ -226,6 +226,7 @@ class TokenType(AutoName):
ARRAY = auto()
ASC = auto()
ASOF = auto()
ATTACH = auto()
AUTO_INCREMENT = auto()
BEGIN = auto()
BETWEEN = auto()
@ -254,6 +255,7 @@ class TokenType(AutoName):
DELETE = auto()
DESC = auto()
DESCRIBE = auto()
DETACH = auto()
DICTIONARY = auto()
DISTINCT = auto()
DISTRIBUTE_BY = auto()
@ -404,6 +406,8 @@ class TokenType(AutoName):
VERSION_SNAPSHOT = auto()
TIMESTAMP_SNAPSHOT = auto()
OPTION = auto()
SINK = auto()
SOURCE = auto()
_ALL_TOKEN_TYPES = list(TokenType)
@ -507,6 +511,8 @@ class _Tokenizer(type):
),
"{#": "#}", # Ensure Jinja comments are tokenized correctly in all dialects
}
if klass.HINT_START in klass.KEYWORDS:
klass._COMMENTS[klass.HINT_START] = "*/"
klass._KEYWORD_TRIE = new_trie(
key.upper()
@ -544,6 +550,10 @@ class _Tokenizer(type):
heredoc_tag_is_identifier=klass.HEREDOC_TAG_IS_IDENTIFIER,
string_escapes_allowed_in_raw_strings=klass.STRING_ESCAPES_ALLOWED_IN_RAW_STRINGS,
nested_comments=klass.NESTED_COMMENTS,
hint_start=klass.HINT_START,
tokens_preceding_hint={
_TOKEN_TYPE_TO_INDEX[v] for v in klass.TOKENS_PRECEDING_HINT
},
)
token_types = RsTokenTypeSettings(
bit_string=_TOKEN_TYPE_TO_INDEX[TokenType.BIT_STRING],
@ -559,6 +569,7 @@ class _Tokenizer(type):
string=_TOKEN_TYPE_TO_INDEX[TokenType.STRING],
var=_TOKEN_TYPE_TO_INDEX[TokenType.VAR],
heredoc_string_alternative=_TOKEN_TYPE_TO_INDEX[klass.HEREDOC_STRING_ALTERNATIVE],
hint=_TOKEN_TYPE_TO_INDEX[TokenType.HINT],
)
klass._RS_TOKENIZER = RsTokenizer(settings, token_types)
else:
@ -629,6 +640,10 @@ class Tokenizer(metaclass=_Tokenizer):
NESTED_COMMENTS = True
HINT_START = "/*+"
TOKENS_PRECEDING_HINT = {TokenType.SELECT, TokenType.INSERT, TokenType.UPDATE, TokenType.DELETE}
# Autofilled
_COMMENTS: t.Dict[str, str] = {}
_FORMAT_STRINGS: t.Dict[str, t.Tuple[str, TokenType]] = {}
@ -644,7 +659,7 @@ class Tokenizer(metaclass=_Tokenizer):
**{f"{prefix}%}}": TokenType.BLOCK_END for prefix in ("", "+", "-")},
**{f"{{{{{postfix}": TokenType.BLOCK_START for postfix in ("+", "-")},
**{f"{prefix}}}}}": TokenType.BLOCK_END for prefix in ("+", "-")},
"/*+": TokenType.HINT,
HINT_START: TokenType.HINT,
"==": TokenType.EQ,
"::": TokenType.DCOLON,
"||": TokenType.DPIPE,
@ -1228,6 +1243,13 @@ class Tokenizer(metaclass=_Tokenizer):
self._advance(alnum=True)
self._comments.append(self._text[comment_start_size:])
if (
comment_start == self.HINT_START
and self.tokens
and self.tokens[-1].token_type in self.TOKENS_PRECEDING_HINT
):
self._add(TokenType.HINT)
# Leading comment is attached to the succeeding token, whilst trailing comment to the preceding.
# Multiple consecutive comments are preserved by appending them to the current comments list.
if comment_start_line == self._prev_token_line:
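This is the tokenizer side of the hint work (mirrored in Rust below): a comment opened by HINT_START is still collected as a comment, but when it directly follows one of TOKENS_PRECEDING_HINT an explicit HINT token is also emitted for the parser to match. A minimal sketch against the base Python tokenizer:

from sqlglot.tokens import Tokenizer, TokenType

tokens = Tokenizer().tokenize("SELECT /*+ ORDERED */ x FROM t")
assert any(t.token_type == TokenType.HINT for t in tokens)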

sqlglotrs/Cargo.lock generated
View file

@ -136,7 +136,7 @@ dependencies = [
[[package]]
name = "sqlglotrs"
version = "0.2.14"
version = "0.3.0"
dependencies = [
"pyo3",
]

View file

@ -1,6 +1,6 @@
[package]
name = "sqlglotrs"
version = "0.2.14"
version = "0.3.0"
edition = "2021"
license = "MIT"

View file

@ -19,6 +19,7 @@ pub struct TokenTypeSettings {
pub string: TokenType,
pub var: TokenType,
pub heredoc_string_alternative: TokenType,
pub hint: TokenType,
}
#[pymethods]
@ -38,6 +39,7 @@ impl TokenTypeSettings {
string: TokenType,
var: TokenType,
heredoc_string_alternative: TokenType,
hint: TokenType,
) -> Self {
TokenTypeSettings {
bit_string,
@ -53,6 +55,7 @@ impl TokenTypeSettings {
string,
var,
heredoc_string_alternative,
hint,
}
}
}
@ -75,9 +78,11 @@ pub struct TokenizerSettings {
pub var_single_tokens: HashSet<char>,
pub commands: HashSet<TokenType>,
pub command_prefix_tokens: HashSet<TokenType>,
pub tokens_preceding_hint: HashSet<TokenType>,
pub heredoc_tag_is_identifier: bool,
pub string_escapes_allowed_in_raw_strings: bool,
pub nested_comments: bool,
pub hint_start: String,
}
#[pymethods]
@ -99,9 +104,11 @@ impl TokenizerSettings {
var_single_tokens: HashSet<String>,
commands: HashSet<TokenType>,
command_prefix_tokens: HashSet<TokenType>,
tokens_preceding_hint: HashSet<TokenType>,
heredoc_tag_is_identifier: bool,
string_escapes_allowed_in_raw_strings: bool,
nested_comments: bool,
hint_start: String,
) -> Self {
let to_char = |v: &String| {
if v.len() == 1 {
@ -150,9 +157,11 @@ impl TokenizerSettings {
var_single_tokens: var_single_tokens_native,
commands,
command_prefix_tokens,
tokens_preceding_hint,
heredoc_tag_is_identifier,
string_escapes_allowed_in_raw_strings,
nested_comments,
hint_start,
}
}
}

View file

@ -395,6 +395,12 @@ impl<'a> TokenizerState<'a> {
.push(self.text()[comment_start_size..].to_string());
}
if comment_start == self.settings.hint_start
&& self.tokens.last().is_some()
&& self.settings.tokens_preceding_hint.contains(&self.tokens.last().unwrap().token_type) {
self.add(self.token_types.hint, None)?;
}
// Leading comment is attached to the succeeding token, whilst trailing comment to the preceding.
// Multiple consecutive comments are preserved by appending them to the current comments list.
if Some(comment_start_line) == self.previous_token_line {

View file

@ -1640,6 +1640,11 @@ WHERE
},
)
self.validate_identity(
"SELECT * FROM ML.FEATURES_AT_TIME(TABLE mydataset.feature_table, time => '2022-06-11 10:00:00+00', num_rows => 1, ignore_feature_nulls => TRUE)"
)
self.validate_identity("SELECT * FROM ML.FEATURES_AT_TIME((SELECT 1), num_rows => 1)")
def test_errors(self):
with self.assertRaises(TokenError):
transpile("'\\'", read="bigquery")
@ -2145,27 +2150,37 @@ OPTIONS (
},
)
sql = f"""SELECT {func}('{{"name": "Jakob", "age": "6"}}', '$.age')"""
self.validate_all(
f"""SELECT {func}('{{"name": "Jakob", "age": "6"}}', '$.age')""",
sql,
write={
"bigquery": f"""SELECT {func}('{{"name": "Jakob", "age": "6"}}', '$.age')""",
"bigquery": sql,
"duckdb": """SELECT '{"name": "Jakob", "age": "6"}' ->> '$.age'""",
"snowflake": """SELECT JSON_EXTRACT_PATH_TEXT('{"name": "Jakob", "age": "6"}', 'age')""",
},
)
self.assertEqual(
self.parse_one(sql).sql("bigquery", normalize_functions="upper"), sql
)
def test_json_extract_array(self):
for func in ("JSON_QUERY_ARRAY", "JSON_EXTRACT_ARRAY"):
with self.subTest(f"Testing BigQuery's {func}"):
sql = f"""SELECT {func}('{{"fruits": [1, "oranges"]}}', '$.fruits')"""
self.validate_all(
f"""SELECT {func}('{{"fruits": [1, "oranges"]}}', '$.fruits')""",
sql,
write={
"bigquery": f"""SELECT {func}('{{"fruits": [1, "oranges"]}}', '$.fruits')""",
"bigquery": sql,
"duckdb": """SELECT CAST('{"fruits": [1, "oranges"]}' -> '$.fruits' AS JSON[])""",
"snowflake": """SELECT TRANSFORM(GET_PATH(PARSE_JSON('{"fruits": [1, "oranges"]}'), 'fruits'), x -> PARSE_JSON(TO_JSON(x)))""",
},
)
self.assertEqual(
self.parse_one(sql).sql("bigquery", normalize_functions="upper"), sql
)
def test_unix_seconds(self):
self.validate_all(
"SELECT UNIX_SECONDS('2008-12-25 15:30:00+00')",

View file

@ -2854,6 +2854,13 @@ FROM subquery2""",
},
)
self.validate_all(
"SELECT ARRAY_LENGTH(GENERATE_DATE_ARRAY(DATE '2020-01-01', DATE '2020-02-01', INTERVAL 1 WEEK))",
write={
"snowflake": "SELECT ARRAY_SIZE((SELECT ARRAY_AGG(*) FROM (SELECT DATEADD(WEEK, CAST(value AS INT), CAST('2020-01-01' AS DATE)) AS value FROM TABLE(FLATTEN(INPUT => ARRAY_GENERATE_RANGE(0, (DATEDIFF(WEEK, CAST('2020-01-01' AS DATE), CAST('2020-02-01' AS DATE)) + 1 - 1) + 1))) AS _u(seq, key, path, index, value, this))))",
},
)
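Snowflake lacks GENERATE_DATE_ARRAY, so the expected SQL above synthesizes it: ARRAY_GENERATE_RANGE yields the step indexes, FLATTEN turns them into rows, DATEADD shifts the start date by each index, and ARRAY_AGG reassembles the array. A sketch that prints the expansion (output abbreviated; shape per the test above):

import sqlglot

sql = "SELECT GENERATE_DATE_ARRAY(DATE '2020-01-01', DATE '2020-02-01', INTERVAL 1 WEEK)"
print(sqlglot.transpile(sql, read="bigquery", write="snowflake")[0])
# ... ARRAY_AGG(*) ... DATEADD(WEEK, CAST(value AS INT), ...) ... FLATTEN(INPUT => ARRAY_GENERATE_RANGE(...)) ...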
def test_set_operation_specifiers(self):
self.validate_all(
"SELECT 1 EXCEPT ALL SELECT 1",

View file

@ -379,10 +379,6 @@ class TestDuckDB(Validator):
"JSON_EXTRACT_PATH_TEXT(x, '$.family')",
"x ->> '$.family'",
)
self.validate_identity(
"ATTACH DATABASE ':memory:' AS new_database", check_command_warning=True
)
self.validate_identity("DETACH DATABASE new_database", check_command_warning=True)
self.validate_identity(
"SELECT {'yes': 'duck', 'maybe': 'goose', 'huh': NULL, 'no': 'heron'}"
)
@ -1392,3 +1388,20 @@ class TestDuckDB(Validator):
else:
self.assertEqual(ignore_null.sql("duckdb"), func.sql("duckdb"))
self.assertNotIn("IGNORE NULLS", windowed_ignore_null.sql("duckdb"))
def test_attach_detach(self):
# ATTACH
self.validate_identity("ATTACH 'file.db'")
self.validate_identity("ATTACH ':memory:' AS db_alias")
self.validate_identity("ATTACH IF NOT EXISTS 'file.db' AS db_alias")
self.validate_identity("ATTACH 'file.db' AS db_alias (READ_ONLY)")
self.validate_identity("ATTACH 'file.db' (READ_ONLY FALSE, TYPE sqlite)")
self.validate_identity("ATTACH 'file.db' (TYPE POSTGRES, SCHEMA 'public')")
self.validate_identity("ATTACH DATABASE 'file.db'", "ATTACH 'file.db'")
# DETACH
self.validate_identity("DETACH new_database")
self.validate_identity("DETACH IF EXISTS file")
self.validate_identity("DETACH DATABASE db", "DETACH db")

View file

@ -708,6 +708,16 @@ class TestMySQL(Validator):
)
def test_mysql(self):
for func in ("CHAR_LENGTH", "CHARACTER_LENGTH"):
with self.subTest(f"Testing MySQL's {func}"):
self.validate_all(
f"SELECT {func}('foo')",
write={
"duckdb": "SELECT LENGTH('foo')",
"mysql": "SELECT CHAR_LENGTH('foo')",
},
)
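CHAR_LENGTH and CHARACTER_LENGTH are synonyms in MySQL; both now canonicalize to CHAR_LENGTH and translate to DuckDB's LENGTH. Sketch:

import sqlglot

print(sqlglot.transpile("SELECT CHARACTER_LENGTH('foo')", read="mysql", write="duckdb")[0])
# SELECT LENGTH('foo')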
self.validate_all(
"SELECT CONCAT('11', '22')",
read={
@ -1319,3 +1329,6 @@ COMMENT='客户账户表'"""
expression = self.parse_one("EXPLAIN ANALYZE SELECT * FROM t")
self.assertIsInstance(expression, exp.Describe)
self.assertEqual(expression.text("style"), "ANALYZE")
for format in ("JSON", "TRADITIONAL", "TREE"):
self.validate_identity(f"DESCRIBE FORMAT={format} UPDATE test SET test_col = 'abc'")

View file

@ -287,6 +287,17 @@ class TestOracle(Validator):
"clickhouse": "TRIM(BOTH 'h' FROM 'Hello World')",
},
)
self.validate_identity(
"SELECT /*+ ORDERED */* FROM tbl", "SELECT /*+ ORDERED */ * FROM tbl"
)
self.validate_identity(
"SELECT /* test */ /*+ ORDERED */* FROM tbl",
"/* test */ SELECT /*+ ORDERED */ * FROM tbl",
)
self.validate_identity(
"SELECT /*+ ORDERED */*/* test */ FROM tbl",
"SELECT /*+ ORDERED */ * /* test */ FROM tbl",
)
def test_join_marker(self):
self.validate_identity("SELECT e1.x, e2.x FROM e e1, e e2 WHERE e1.y (+) = e2.y")

View file

@ -127,10 +127,6 @@ class TestPostgres(Validator):
"pg_catalog.PG_TABLE_IS_VISIBLE(c.oid) "
"ORDER BY 2, 3"
)
self.validate_identity(
"/*+ some comment*/ SELECT b.foo, b.bar FROM baz AS b",
"/* + some comment */ SELECT b.foo, b.bar FROM baz AS b",
)
self.validate_identity(
"SELECT ARRAY[1, 2, 3] <@ ARRAY[1, 2]",
"SELECT ARRAY[1, 2] @> ARRAY[1, 2, 3]",
@ -819,6 +815,11 @@ class TestPostgres(Validator):
},
)
self.validate_identity(
"/*+ some comment*/ SELECT b.foo, b.bar FROM baz AS b",
"/* + some comment */ SELECT b.foo, b.bar FROM baz AS b",
)
def test_ddl(self):
# Checks that user-defined types are parsed into DataType instead of Identifier
self.parse_one("CREATE TABLE t (a udt)").this.expressions[0].args["kind"].assert_is(

View file

@ -7,6 +7,12 @@ class TestPresto(Validator):
dialect = "presto"
def test_cast(self):
self.validate_identity("DEALLOCATE PREPARE my_query", check_command_warning=True)
self.validate_identity("DESCRIBE INPUT x", check_command_warning=True)
self.validate_identity("DESCRIBE OUTPUT x", check_command_warning=True)
self.validate_identity(
"RESET SESSION hive.optimized_reader_enabled", check_command_warning=True
)
self.validate_identity("SELECT * FROM x qualify", "SELECT * FROM x AS qualify")
self.validate_identity("CAST(x AS IPADDRESS)")
self.validate_identity("CAST(x AS IPPREFIX)")
@ -722,7 +728,7 @@ class TestPresto(Validator):
"SELECT MIN_BY(a.id, a.timestamp, 3) FROM a",
write={
"clickhouse": "SELECT argMin(a.id, a.timestamp) FROM a",
"duckdb": "SELECT ARG_MIN(a.id, a.timestamp) FROM a",
"duckdb": "SELECT ARG_MIN(a.id, a.timestamp, 3) FROM a",
"presto": "SELECT MIN_BY(a.id, a.timestamp, 3) FROM a",
"snowflake": "SELECT MIN_BY(a.id, a.timestamp, 3) FROM a",
"spark": "SELECT MIN_BY(a.id, a.timestamp) FROM a",

View file

@ -12,3 +12,12 @@ class TestRisingWave(Validator):
"": "SELECT a FROM tbl FOR UPDATE",
},
)
self.validate_identity(
"CREATE SOURCE from_kafka (*, gen_i32_field INT AS int32_field + 2, gen_i64_field INT AS int64_field + 2, WATERMARK FOR time_col AS time_col - INTERVAL '5 SECOND') INCLUDE header foo VARCHAR AS myheader INCLUDE key AS mykey WITH (connector='kafka', topic='my_topic') FORMAT PLAIN ENCODE PROTOBUF (A=1, B=2) KEY ENCODE PROTOBUF (A=3, B=4)"
)
self.validate_identity(
"CREATE SINK my_sink AS SELECT * FROM A WITH (connector='kafka', topic='my_topic') FORMAT PLAIN ENCODE PROTOBUF (A=1, B=2) KEY ENCODE PROTOBUF (A=3, B=4)"
)
self.validate_identity(
"WITH t1 AS MATERIALIZED (SELECT 1), t2 AS NOT MATERIALIZED (SELECT 2) SELECT * FROM t1, t2"
)
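These identities exercise the new SINK and SOURCE token types from the tokenizer hunk above, along with RisingWave's unwrapped AS expr generated columns enabled by WRAPPED_TRANSFORM_COLUMN_CONSTRAINT. A minimal round-trip sketch:

import sqlglot

sql = "CREATE SINK my_sink AS SELECT * FROM A WITH (connector='kafka', topic='my_topic') FORMAT PLAIN ENCODE PROTOBUF (A=1, B=2) KEY ENCODE PROTOBUF (A=3, B=4)"
assert sqlglot.transpile(sql, read="risingwave")[0] == sql  # identity=True keeps the source dialect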

View file

@ -1479,13 +1479,20 @@ WHERE
"snowflake": "CREATE OR REPLACE TRANSIENT TABLE a (id INT)",
},
)
self.validate_all(
"CREATE TABLE a (b INT)",
read={"teradata": "CREATE MULTISET TABLE a (b INT)"},
write={"snowflake": "CREATE TABLE a (b INT)"},
)
self.validate_identity("CREATE TABLE a TAG (key1='value_1', key2='value_2')")
self.validate_all(
"CREATE TABLE a TAG (key1='value_1')",
read={
"snowflake": "CREATE TABLE a WITH TAG (key1='value_1')",
},
)
for action in ("SET", "DROP"):
with self.subTest(f"ALTER COLUMN {action} NOT NULL"):
self.validate_all(
@ -2250,3 +2257,13 @@ SINGLE = TRUE""",
self.validate_identity(
"GRANT ALL PRIVILEGES ON FUNCTION mydb.myschema.ADD5(number) TO ROLE analyst"
)
def test_window_function_arg(self):
query = "SELECT * FROM TABLE(db.schema.FUNC(a) OVER ())"
ast = self.parse_one(query)
window = ast.find(exp.Window)
self.assertEqual(ast.sql("snowflake"), query)
self.assertEqual(len(list(ast.find_all(exp.Column))), 1)
self.assertEqual(window.this.sql("snowflake"), "db.schema.FUNC(a)")

View file

@ -1308,6 +1308,12 @@ WHERE
},
)
for fmt in ("WEEK", "WW", "WK"):
self.validate_identity(
f"SELECT DATEPART({fmt}, '2024-11-21')",
"SELECT DATEPART(WK, CAST('2024-11-21' AS DATETIME2))",
)
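T-SQL accepts WEEK, WW and WK for the week date part; all three now normalize to WK, and the string literal is cast so the part resolves against a DATETIME2. Sketch:

import sqlglot

print(sqlglot.transpile("SELECT DATEPART(WEEK, '2024-11-21')", read="tsql")[0])
# SELECT DATEPART(WK, CAST('2024-11-21' AS DATETIME2))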
def test_convert(self):
self.validate_all(
"CONVERT(NVARCHAR(200), x)",

View file

@ -883,3 +883,5 @@ GRANT SELECT, INSERT ON FUNCTION tbl TO user
GRANT SELECT ON orders TO ROLE PUBLIC
GRANT SELECT ON nation TO alice WITH GRANT OPTION
GRANT DELETE ON SCHEMA finance TO bob
SELECT attach
SELECT detach

View file

@ -2,7 +2,7 @@ SELECT w.d + w.e AS c FROM w AS w;
SELECT CONCAT("w"."d", "w"."e") AS "c" FROM "w" AS "w";
SELECT CAST(w.d AS DATE) > w.e AS a FROM w AS w;
SELECT CAST("w"."d" AS DATE) > CAST("w"."e" AS DATETIME) AS "a" FROM "w" AS "w";
SELECT CAST("w"."d" AS DATE) > CAST("w"."e" AS DATE) AS "a" FROM "w" AS "w";
SELECT CAST(1 AS VARCHAR) AS a FROM w AS w;
SELECT CAST(1 AS VARCHAR) AS "a" FROM "w" AS "w";
@ -102,7 +102,7 @@ DATEDIFF('2023-01-01', '2023-01-02', DAY);
DATEDIFF(CAST('2023-01-01' AS DATETIME), CAST('2023-01-02' AS DATETIME), DAY);
SELECT "t"."d" > '2023-01-01' AS "d" FROM "temporal" AS "t";
SELECT "t"."d" > CAST('2023-01-01' AS DATETIME) AS "d" FROM "temporal" AS "t";
SELECT "t"."d" > CAST('2023-01-01' AS DATE) AS "d" FROM "temporal" AS "t";
SELECT "t"."d" > CAST('2023-01-01' AS DATETIME) AS "d" FROM "temporal" AS "t";
SELECT "t"."d" > CAST('2023-01-01' AS DATETIME) AS "d" FROM "temporal" AS "t";
@ -110,6 +110,17 @@ SELECT "t"."d" > CAST('2023-01-01' AS DATETIME) AS "d" FROM "temporal" AS "t";
SELECT "t"."t" > '2023-01-01 00:00:01' AS "t" FROM "temporal" AS "t";
SELECT "t"."t" > CAST('2023-01-01 00:00:01' AS DATETIME) AS "t" FROM "temporal" AS "t";
WITH "t" AS (SELECT CAST("ext"."created_at" AS TIMESTAMP) AS "created_at" FROM "ext" AS "ext") SELECT "t"."created_at" > '2024-10-01 12:05:02' AS "col" FROM "t" AS "t";
WITH "t" AS (SELECT CAST("ext"."created_at" AS TIMESTAMP) AS "created_at" FROM "ext" AS "ext") SELECT "t"."created_at" > CAST('2024-10-01 12:05:02' AS TIMESTAMP) AS "col" FROM "t" AS "t";
# dialect: mysql
SELECT `t`.`d` < '2023-01-01 00:00:01' AS `col` FROM `temporal` AS `t`;
SELECT CAST(`t`.`d` AS DATETIME) < CAST('2023-01-01 00:00:01' AS DATETIME) AS `col` FROM `temporal` AS `t`;
# dialect: mysql
SELECT CAST(`t`.`some_col` AS DATE) < CAST(`t`.`other_col` AS CHAR) AS `col` FROM `other_table` AS `t`;
SELECT CAST(CAST(`t`.`some_col` AS DATE) AS DATETIME) < CAST(CAST(`t`.`other_col` AS CHAR) AS DATETIME) AS `col` FROM `other_table` AS `t`;
--------------------------------------
-- Remove redundant casts
--------------------------------------

View file

@ -736,8 +736,8 @@ WITH "salesreturns" AS (
"date_dim"."d_date" AS "d_date"
FROM "date_dim" AS "date_dim"
WHERE
CAST("date_dim"."d_date" AS DATETIME) <= CAST('2002-09-05' AS DATE)
AND CAST("date_dim"."d_date" AS DATETIME) >= CAST('2002-08-22' AS DATE)
CAST("date_dim"."d_date" AS DATE) <= CAST('2002-09-05' AS DATE)
AND CAST("date_dim"."d_date" AS DATE) >= CAST('2002-08-22' AS DATE)
), "ssr" AS (
SELECT
"store"."s_store_id" AS "s_store_id",
@ -1853,8 +1853,8 @@ SELECT
FROM "web_sales" AS "web_sales"
JOIN "date_dim" AS "date_dim"
ON "date_dim"."d_date_sk" = "web_sales"."ws_sold_date_sk"
AND CAST("date_dim"."d_date" AS DATETIME) <= CAST('2000-06-10' AS DATE)
AND CAST("date_dim"."d_date" AS DATETIME) >= CAST('2000-05-11' AS DATE)
AND CAST("date_dim"."d_date" AS DATE) <= CAST('2000-06-10' AS DATE)
AND CAST("date_dim"."d_date" AS DATE) >= CAST('2000-05-11' AS DATE)
JOIN "item" AS "item"
ON "item"."i_category" IN ('Home', 'Men', 'Women')
AND "item"."i_item_sk" = "web_sales"."ws_item_sk"
@ -2422,7 +2422,7 @@ JOIN "date_dim" AS "date_dim"
AND "date_dim"."d_date" >= '2002-3-01'
AND (
CAST('2002-3-01' AS DATE) + INTERVAL '60' DAY
) >= CAST("date_dim"."d_date" AS DATETIME)
) >= CAST("date_dim"."d_date" AS DATE)
WHERE
"_u_3"."_u_4" IS NULL
AND ARRAY_ANY("_u_0"."_u_2", "_x" -> "cs1"."cs_warehouse_sk" <> "_x")
@ -2731,8 +2731,8 @@ SELECT
FROM "catalog_sales" AS "catalog_sales"
JOIN "date_dim" AS "date_dim"
ON "catalog_sales"."cs_sold_date_sk" = "date_dim"."d_date_sk"
AND CAST("date_dim"."d_date" AS DATETIME) <= CAST('2001-03-05' AS DATE)
AND CAST("date_dim"."d_date" AS DATETIME) >= CAST('2001-02-03' AS DATE)
AND CAST("date_dim"."d_date" AS DATE) <= CAST('2001-03-05' AS DATE)
AND CAST("date_dim"."d_date" AS DATE) >= CAST('2001-02-03' AS DATE)
JOIN "item" AS "item"
ON "catalog_sales"."cs_item_sk" = "item"."i_item_sk"
AND "item"."i_category" IN ('Children', 'Women', 'Electronics')
@ -2811,8 +2811,8 @@ WITH "x" AS (
FROM "inventory" AS "inventory"
JOIN "date_dim" AS "date_dim"
ON "date_dim"."d_date_sk" = "inventory"."inv_date_sk"
AND CAST("date_dim"."d_date" AS DATETIME) <= CAST('2000-06-12' AS DATE)
AND CAST("date_dim"."d_date" AS DATETIME) >= CAST('2000-04-13' AS DATE)
AND CAST("date_dim"."d_date" AS DATE) <= CAST('2000-06-12' AS DATE)
AND CAST("date_dim"."d_date" AS DATE) >= CAST('2000-04-13' AS DATE)
JOIN "item" AS "item"
ON "inventory"."inv_item_sk" = "item"."i_item_sk"
AND "item"."i_current_price" <= 1.49
@ -3944,7 +3944,7 @@ WITH "catalog_sales_2" AS (
FROM "date_dim" AS "date_dim"
WHERE
"date_dim"."d_date" >= '2001-03-04'
AND CAST("date_dim"."d_date" AS DATETIME) <= CAST('2001-06-02' AS DATE)
AND CAST("date_dim"."d_date" AS DATE) <= CAST('2001-06-02' AS DATE)
), "_u_0" AS (
SELECT
1.3 * AVG("catalog_sales"."cs_ext_discount_amt") AS "_col_0",
@ -4510,8 +4510,8 @@ JOIN "inventory" AS "inventory"
AND "inventory"."inv_quantity_on_hand" >= 100
JOIN "date_dim" AS "date_dim"
ON "date_dim"."d_date_sk" = "inventory"."inv_date_sk"
AND CAST("date_dim"."d_date" AS DATETIME) <= CAST('1999-05-05' AS DATE)
AND CAST("date_dim"."d_date" AS DATETIME) >= CAST('1999-03-06' AS DATE)
AND CAST("date_dim"."d_date" AS DATE) <= CAST('1999-05-05' AS DATE)
AND CAST("date_dim"."d_date" AS DATE) >= CAST('1999-03-06' AS DATE)
WHERE
"item"."i_current_price" <= 50
AND "item"."i_current_price" >= 20
@ -4787,8 +4787,8 @@ LEFT JOIN "catalog_returns" AS "catalog_returns"
AND "catalog_returns"."cr_order_number" = "catalog_sales"."cs_order_number"
JOIN "date_dim" AS "date_dim"
ON "catalog_sales"."cs_sold_date_sk" = "date_dim"."d_date_sk"
AND CAST("date_dim"."d_date" AS DATETIME) <= CAST('2002-07-01' AS DATE)
AND CAST("date_dim"."d_date" AS DATETIME) >= CAST('2002-05-02' AS DATE)
AND CAST("date_dim"."d_date" AS DATE) <= CAST('2002-07-01' AS DATE)
AND CAST("date_dim"."d_date" AS DATE) >= CAST('2002-05-02' AS DATE)
JOIN "item" AS "item"
ON "catalog_sales"."cs_item_sk" = "item"."i_item_sk"
AND "item"."i_current_price" <= 1.49
@ -10318,8 +10318,8 @@ WITH "date_dim_2" AS (
"date_dim"."d_date" AS "d_date"
FROM "date_dim" AS "date_dim"
WHERE
CAST("date_dim"."d_date" AS DATETIME) <= CAST('2001-09-15' AS DATE)
AND CAST("date_dim"."d_date" AS DATETIME) >= CAST('2001-08-16' AS DATE)
CAST("date_dim"."d_date" AS DATE) <= CAST('2001-09-15' AS DATE)
AND CAST("date_dim"."d_date" AS DATE) >= CAST('2001-08-16' AS DATE)
), "store_2" AS (
SELECT
"store"."s_store_sk" AS "s_store_sk"
@ -10828,8 +10828,8 @@ WITH "date_dim_2" AS (
"date_dim"."d_date" AS "d_date"
FROM "date_dim" AS "date_dim"
WHERE
CAST("date_dim"."d_date" AS DATETIME) <= CAST('2000-09-25' AS DATE)
AND CAST("date_dim"."d_date" AS DATETIME) >= CAST('2000-08-26' AS DATE)
CAST("date_dim"."d_date" AS DATE) <= CAST('2000-09-25' AS DATE)
AND CAST("date_dim"."d_date" AS DATE) >= CAST('2000-08-26' AS DATE)
), "item_2" AS (
SELECT
"item"."i_item_sk" AS "i_item_sk",
@ -11109,8 +11109,8 @@ JOIN "store_sales" AS "store_sales"
ON "item"."i_item_sk" = "store_sales"."ss_item_sk"
JOIN "date_dim" AS "date_dim"
ON "date_dim"."d_date_sk" = "inventory"."inv_date_sk"
AND CAST("date_dim"."d_date" AS DATETIME) <= CAST('1998-06-26' AS DATE)
AND CAST("date_dim"."d_date" AS DATETIME) >= CAST('1998-04-27' AS DATE)
AND CAST("date_dim"."d_date" AS DATE) <= CAST('1998-06-26' AS DATE)
AND CAST("date_dim"."d_date" AS DATE) >= CAST('1998-04-27' AS DATE)
WHERE
"item"."i_current_price" <= 93
AND "item"."i_current_price" >= 63
@ -12180,7 +12180,7 @@ WITH "web_sales_2" AS (
FROM "date_dim" AS "date_dim"
WHERE
"date_dim"."d_date" >= '2002-03-29'
AND CAST("date_dim"."d_date" AS DATETIME) <= CAST('2002-06-27' AS DATE)
AND CAST("date_dim"."d_date" AS DATE) <= CAST('2002-06-27' AS DATE)
), "_u_0" AS (
SELECT
1.3 * AVG("web_sales"."ws_ext_discount_amt") AS "_col_0",
@ -12321,7 +12321,7 @@ JOIN "date_dim" AS "date_dim"
AND "date_dim"."d_date_sk" = "ws1"."ws_ship_date_sk"
AND (
CAST('2000-3-01' AS DATE) + INTERVAL '60' DAY
) >= CAST("date_dim"."d_date" AS DATETIME)
) >= CAST("date_dim"."d_date" AS DATE)
JOIN "web_site" AS "web_site"
ON "web_site"."web_company_name" = 'pri'
AND "web_site"."web_site_sk" = "ws1"."ws_web_site_sk"
@ -12411,7 +12411,7 @@ JOIN "date_dim" AS "date_dim"
AND "date_dim"."d_date_sk" = "ws1"."ws_ship_date_sk"
AND (
CAST('2000-4-01' AS DATE) + INTERVAL '60' DAY
) >= CAST("date_dim"."d_date" AS DATETIME)
) >= CAST("date_dim"."d_date" AS DATE)
JOIN "web_site" AS "web_site"
ON "web_site"."web_company_name" = 'pri'
AND "web_site"."web_site_sk" = "ws1"."ws_web_site_sk"
@ -12595,8 +12595,8 @@ SELECT
FROM "store_sales" AS "store_sales"
JOIN "date_dim" AS "date_dim"
ON "date_dim"."d_date_sk" = "store_sales"."ss_sold_date_sk"
AND CAST("date_dim"."d_date" AS DATETIME) <= CAST('2000-06-17' AS DATE)
AND CAST("date_dim"."d_date" AS DATETIME) >= CAST('2000-05-18' AS DATE)
AND CAST("date_dim"."d_date" AS DATE) <= CAST('2000-06-17' AS DATE)
AND CAST("date_dim"."d_date" AS DATE) >= CAST('2000-05-18' AS DATE)
JOIN "item" AS "item"
ON "item"."i_category" IN ('Men', 'Home', 'Electronics')
AND "item"."i_item_sk" = "store_sales"."ss_item_sk"

View file

@ -132,7 +132,6 @@ class TestOptimizer(unittest.TestCase):
func,
pretty=False,
execute=False,
set_dialect=False,
only=None,
**kwargs,
):
@ -158,7 +157,7 @@ class TestOptimizer(unittest.TestCase):
validate_qualify_columns
)
if set_dialect and dialect:
if dialect:
func_kwargs["dialect"] = dialect
future = pool.submit(parse_and_optimize, func, sql, dialect, **func_kwargs)
@ -207,7 +206,6 @@ class TestOptimizer(unittest.TestCase):
pretty=True,
execute=True,
schema=schema,
set_dialect=True,
)
def test_isolate_table_selects(self):
@ -235,7 +233,6 @@ class TestOptimizer(unittest.TestCase):
optimizer.qualify_tables.qualify_tables,
db="db",
catalog="c",
set_dialect=True,
)
def test_normalize(self):
@ -446,11 +443,8 @@ class TestOptimizer(unittest.TestCase):
qualify_columns,
execute=True,
schema=self.schema,
set_dialect=True,
)
self.check_file(
"qualify_columns_ddl", qualify_columns, schema=self.schema, set_dialect=True
)
self.check_file("qualify_columns_ddl", qualify_columns, schema=self.schema)
def test_qualify_columns__with_invisible(self):
schema = MappingSchema(self.schema, {"x": {"a"}, "y": {"b"}, "z": {"b"}})
@ -475,7 +469,6 @@ class TestOptimizer(unittest.TestCase):
self.check_file(
"normalize_identifiers",
optimizer.normalize_identifiers.normalize_identifiers,
set_dialect=True,
)
self.assertEqual(optimizer.normalize_identifiers.normalize_identifiers("a%").sql(), '"a%"')
@ -484,14 +477,13 @@ class TestOptimizer(unittest.TestCase):
self.check_file(
"quote_identifiers",
optimizer.qualify_columns.quote_identifiers,
set_dialect=True,
)
def test_pushdown_projection(self):
self.check_file("pushdown_projections", pushdown_projections, schema=self.schema)
def test_simplify(self):
self.check_file("simplify", simplify, set_dialect=True)
self.check_file("simplify", simplify)
expression = parse_one("SELECT a, c, b FROM table1 WHERE 1 = 1")
self.assertEqual(simplify(simplify(expression.find(exp.Where))).sql(), "WHERE TRUE")