
Merging upstream version 26.2.1.

Signed-off-by: Daniel Baumann <daniel@debian.org>
Daniel Baumann 2025-02-13 22:00:08 +01:00
parent a5399bd16b
commit 4d0635d636
Signed by: daniel
GPG key ID: FBB4F0E80A80222F
85 changed files with 57142 additions and 52288 deletions

@@ -1,10 +1,12 @@
-on: [pull_request]
+on:
+  pull_request:
+    paths:
+      - 'sqlglotrs/**'
 name: benchmark pull requests
 jobs:
   run-benchmark:
     name: run benchmark
     runs-on: ubuntu-latest
     if: ${{ false }} # Uncomment this when the workflow is fixed (currently fails in PRs)
     steps:
       - uses: actions/checkout@v4
       - uses: boa-dev/criterion-compare-action@v3

@@ -1,6 +1,57 @@
Changelog
=========
## [v26.2.0] - 2025-01-14
### :boom: BREAKING CHANGES
- due to [`f3fcc10`](https://github.com/tobymao/sqlglot/commit/f3fcc1013dfcfdaa388ba3426ed82c4fe0eefab1) - allow limit, offset to be used as both modifiers and aliases *(PR [#4589](https://github.com/tobymao/sqlglot/pull/4589) by [@georgesittas](https://github.com/georgesittas))*:

  allow limit, offset to be used as both modifiers and aliases (#4589)
- due to [`b7ab3f1`](https://github.com/tobymao/sqlglot/commit/b7ab3f1697bda3d67a1183e6cd78dbd13777112b) - exp.Merge condition for Trino/Postgres *(PR [#4596](https://github.com/tobymao/sqlglot/pull/4596) by [@MikeWallis42](https://github.com/MikeWallis42))*:

  exp.Merge condition for Trino/Postgres (#4596)
- due to [`e617d40`](https://github.com/tobymao/sqlglot/commit/e617d407ece96d3c3311c95936ccdca6ecd35a70) - extend ANALYZE common syntax to cover multiple dialects *(PR [#4591](https://github.com/tobymao/sqlglot/pull/4591) by [@zashroof](https://github.com/zashroof))*:

  extend ANALYZE common syntax to cover multiple dialects (#4591)
### :sparkles: New Features
- [`c75016a`](https://github.com/tobymao/sqlglot/commit/c75016a83cda5eb328f854a8628884b90dec10e4) - parse analyze compute statistics *(PR [#4547](https://github.com/tobymao/sqlglot/pull/4547) by [@zashroof](https://github.com/zashroof))*
- [`986a1da`](https://github.com/tobymao/sqlglot/commit/986a1da98fa5648bc3e364ae436dc4168a1b33ed) - Druid dialect *(PR [#4579](https://github.com/tobymao/sqlglot/pull/4579) by [@betodealmeida](https://github.com/betodealmeida))*
- [`bc9975f`](https://github.com/tobymao/sqlglot/commit/bc9975fe80d66b0c25b8755f1757f049edb4d0be) - move to rustc fx hashmap *(PR [#4588](https://github.com/tobymao/sqlglot/pull/4588) by [@benfdking](https://github.com/benfdking))*
- [`853cbe6`](https://github.com/tobymao/sqlglot/commit/853cbe655f2aa3fa4debb8091b335eb6f9530390) - cleaner IS_ASCII for TSQL *(PR [#4592](https://github.com/tobymao/sqlglot/pull/4592) by [@pruzko](https://github.com/pruzko))*
- [`3ebd879`](https://github.com/tobymao/sqlglot/commit/3ebd87919a4a9947c077c657c03ba2d2b3799620) - LOGICAL_AND and LOGICAL_OR for Oracle *(PR [#4593](https://github.com/tobymao/sqlglot/pull/4593) by [@pruzko](https://github.com/pruzko))*
- [`e617d40`](https://github.com/tobymao/sqlglot/commit/e617d407ece96d3c3311c95936ccdca6ecd35a70) - extend ANALYZE common syntax to cover multiple dialects *(PR [#4591](https://github.com/tobymao/sqlglot/pull/4591) by [@zashroof](https://github.com/zashroof))*
### :bug: Bug Fixes
- [`766d698`](https://github.com/tobymao/sqlglot/commit/766d69886ac088de7dd9a22d71124ffa1b36d003) - **postgres**: Revert exp.StrPos generation *(PR [#4586](https://github.com/tobymao/sqlglot/pull/4586) by [@VaggelisD](https://github.com/VaggelisD))*
- [`f3fcc10`](https://github.com/tobymao/sqlglot/commit/f3fcc1013dfcfdaa388ba3426ed82c4fe0eefab1) - **parser**: allow limit, offset to be used as both modifiers and aliases *(PR [#4589](https://github.com/tobymao/sqlglot/pull/4589) by [@georgesittas](https://github.com/georgesittas))*
- :arrow_lower_right: *fixes issue [#4575](https://github.com/tobymao/sqlglot/issues/4575) opened by [@baruchoxman](https://github.com/baruchoxman)*
- [`2bea466`](https://github.com/tobymao/sqlglot/commit/2bea466cbef3adfc09185176ee38ddf820b3f7ab) - **optimizer**: unions on nested subqueries *(PR [#4603](https://github.com/tobymao/sqlglot/pull/4603) by [@barakalon](https://github.com/barakalon))*
- [`199508a`](https://github.com/tobymao/sqlglot/commit/199508a77c62f75b5e12fee47828d34e4903c706) - **snowflake**: treat $ as part of the json path key identifier *(PR [#4604](https://github.com/tobymao/sqlglot/pull/4604) by [@georgesittas](https://github.com/georgesittas))*
- [`b7ab3f1`](https://github.com/tobymao/sqlglot/commit/b7ab3f1697bda3d67a1183e6cd78dbd13777112b) - exp.Merge condition for Trino/Postgres *(PR [#4596](https://github.com/tobymao/sqlglot/pull/4596) by [@MikeWallis42](https://github.com/MikeWallis42))*
- :arrow_lower_right: *fixes issue [#4595](https://github.com/tobymao/sqlglot/issues/4595) opened by [@MikeWallis42](https://github.com/MikeWallis42)*
### :recycle: Refactors
- [`c0f7309`](https://github.com/tobymao/sqlglot/commit/c0f7309327e21204a0a0f273712d3097f02f6796) - simplify `trie_filter` closure in `Tokenizer` initialization *(PR [#4599](https://github.com/tobymao/sqlglot/pull/4599) by [@gvozdvmozgu](https://github.com/gvozdvmozgu))*
- [`fb93219`](https://github.com/tobymao/sqlglot/commit/fb932198087e5e3aa1a42e65ac30f28e24c6d84f) - replace `std::mem::replace` with `std::mem::take` and `Vec::drain` *(PR [#4600](https://github.com/tobymao/sqlglot/pull/4600) by [@gvozdvmozgu](https://github.com/gvozdvmozgu))*
### :wrench: Chores
- [`672d656`](https://github.com/tobymao/sqlglot/commit/672d656eb5a014ba42492ba2c2a9a33ebd145bd8) - clean up ANALYZE implementation *(PR [#4607](https://github.com/tobymao/sqlglot/pull/4607) by [@georgesittas](https://github.com/georgesittas))*
- [`e58a8cb`](https://github.com/tobymao/sqlglot/commit/e58a8cb4d388d22eff8fd2cca08f38e4c42075d6) - apply clippy fixes *(PR [#4608](https://github.com/tobymao/sqlglot/pull/4608) by [@benfdking](https://github.com/benfdking))*
- [`5502c94`](https://github.com/tobymao/sqlglot/commit/5502c94d665a2ed354e44beb145e767bab00adfa) - bump sqlglotrs to 0.3.5 *(commit by [@georgesittas](https://github.com/georgesittas))*
## [v26.1.3] - 2025-01-09
### :bug: Bug Fixes
- [`d250846`](https://github.com/tobymao/sqlglot/commit/d250846d05711ac62a45efd4930f0ca712841b11) - **snowflake**: generate LIMIT when OFFSET exists [#4575](https://github.com/tobymao/sqlglot/pull/4575) *(PR [#4581](https://github.com/tobymao/sqlglot/pull/4581) by [@geooo109](https://github.com/geooo109))*
### :wrench: Chores
- [`ffbb935`](https://github.com/tobymao/sqlglot/commit/ffbb9350f8d0decab4555471ec2e468fa2741f5f) - install python 3.7 when building windows wheel for sqlglotrs *(PR [#4585](https://github.com/tobymao/sqlglot/pull/4585) by [@georgesittas](https://github.com/georgesittas))*
- [`1ea05c0`](https://github.com/tobymao/sqlglot/commit/1ea05c0b4e3cf53482058b22ecac7ec7c1de525d) - bump sqlglotrs to 0.3.4 *(commit by [@georgesittas](https://github.com/georgesittas))*
## [v26.1.2] - 2025-01-08
### :wrench: Chores
- [`e33af0b`](https://github.com/tobymao/sqlglot/commit/e33af0bcd859571dab68aef3a1fc9ecbf5c49e71) - try setup-python@v5 in the publish job *(PR [#4582](https://github.com/tobymao/sqlglot/pull/4582) by [@georgesittas](https://github.com/georgesittas))*
@@ -5590,3 +5641,5 @@ Changelog
[v26.1.0]: https://github.com/tobymao/sqlglot/compare/v26.0.1...v26.1.0
[v26.1.1]: https://github.com/tobymao/sqlglot/compare/v26.1.0...v26.1.1
[v26.1.2]: https://github.com/tobymao/sqlglot/compare/v26.1.1...v26.1.2
[v26.1.3]: https://github.com/tobymao/sqlglot/compare/v26.1.2...v26.1.3
[v26.2.0]: https://github.com/tobymao/sqlglot/compare/v26.1.3...v26.2.0

File diff suppressed because one or more lines are too long

@@ -76,8 +76,8 @@
 __version_tuple__: VERSION_TUPLE
 version_tuple: VERSION_TUPLE
 
-__version__ = version = '26.1.2'
-__version_tuple__ = version_tuple = (26, 1, 2)
+__version__ = version = '26.2.0'
+__version_tuple__ = version_tuple = (26, 2, 0)

@@ -97,7 +97,7 @@
 version: str =
-'26.1.2'
+'26.2.0'

@@ -109,7 +109,7 @@
 version_tuple: object =
-(26, 1, 2)
+(26, 2, 0)

@@ -41,6 +41,7 @@
 <li><a href="dialects/dialect.html">dialect</a></li>
 <li><a href="dialects/doris.html">doris</a></li>
 <li><a href="dialects/drill.html">drill</a></li>
+<li><a href="dialects/druid.html">druid</a></li>
 <li><a href="dialects/duckdb.html">duckdb</a></li>
 <li><a href="dialects/hive.html">hive</a></li>
 <li><a href="dialects/materialize.html">materialize</a></li>
@@ -212,25 +213,26 @@ dialect implementations in order to understand how their various components can
 from sqlglot.dialects.dialect import Dialect, Dialects
 from sqlglot.dialects.doris import Doris
 from sqlglot.dialects.drill import Drill
+from sqlglot.dialects.druid import Druid
 from sqlglot.dialects.duckdb import DuckDB
 from sqlglot.dialects.hive import Hive
 from sqlglot.dialects.materialize import Materialize
 from sqlglot.dialects.mysql import MySQL
 from sqlglot.dialects.oracle import Oracle
 from sqlglot.dialects.postgres import Postgres
 from sqlglot.dialects.presto import Presto
 from sqlglot.dialects.prql import PRQL
 from sqlglot.dialects.redshift import Redshift
 from sqlglot.dialects.risingwave import RisingWave
 from sqlglot.dialects.snowflake import Snowflake
 from sqlglot.dialects.spark import Spark
 from sqlglot.dialects.spark2 import Spark2
 from sqlglot.dialects.sqlite import SQLite
 from sqlglot.dialects.starrocks import StarRocks
 from sqlglot.dialects.tableau import Tableau
 from sqlglot.dialects.teradata import Teradata
 from sqlglot.dialects.trino import Trino
 from sqlglot.dialects.tsql import TSQL

File diff suppressed because one or more lines are too long (27 files)

@@ -2709,6 +2709,7 @@ Default: True</li>
 log_sql
 use_sql
 binary
+ceil_floor
 function_fallback_sql
 func
 format_args

@@ -2807,6 +2808,13 @@ Default: True</li>
 partitionbyrangeproperty_sql
 partitionbyrangepropertydynamic_sql
 unpivotcolumns_sql
+analyzesample_sql
+analyzestatistics_sql
+analyzehistogram_sql
+analyzedelete_sql
+analyzelistchainedrows_sql
+analyzevalidate_sql
+analyze_sql
 xmltable_sql

File diff suppressed because it is too large

File diff suppressed because one or more lines are too long

@@ -1920,7 +1920,7 @@ belong to some totally-ordered set.</p>
 DATE_UNITS =
-{'month', 'year_month', 'week', 'year', 'quarter', 'day'}
+{'day', 'week', 'quarter', 'month', 'year', 'year_month'}

@@ -641,7 +641,7 @@
 ALL_JSON_PATH_PARTS =
-{JSONPathRecursive, JSONPathKey, JSONPathWildcard, JSONPathFilter, JSONPathUnion, JSONPathSubscript, JSONPathSelector, JSONPathSlice, JSONPathScript, JSONPathRoot}
+{JSONPathSlice, JSONPathScript, JSONPathRoot, JSONPathRecursive, JSONPathKey, JSONPathWildcard, JSONPathFilter, JSONPathUnion, JSONPathSubscript, JSONPathSelector}

File diff suppressed because one or more lines are too long

@@ -581,7 +581,7 @@ queries if it would result in multiple table selects in a single query:</p>
 UNMERGABLE_ARGS =
-{'limit', 'connect', 'format', 'options', 'sample', 'match', 'with', 'prewhere', 'settings', 'having', 'kind', 'pivots', 'offset', 'group', 'operation_modifiers', 'sort', 'cluster', 'distinct', 'locks', 'laterals', 'windows', 'qualify', 'distribute', 'into'}
+{'operation_modifiers', 'cluster', 'kind', 'limit', 'sort', 'laterals', 'options', 'distinct', 'format', 'with', 'settings', 'connect', 'match', 'qualify', 'prewhere', 'group', 'locks', 'offset', 'sample', 'into', 'distribute', 'windows', 'pivots', 'having'}

File diff suppressed because it is too large

@@ -3238,7 +3238,7 @@ prefix are statically known.</p>
 DATETRUNC_COMPARISONS =
-{GT, LT, NEQ, EQ, In, GTE, LTE}
+{LTE, GT, LT, NEQ, EQ, In, GTE}

@@ -3322,7 +3322,7 @@
 JOINS =
-{('RIGHT', 'OUTER'), ('RIGHT', ''), ('', 'INNER'), ('', '')}
+{('RIGHT', ''), ('', 'INNER'), ('RIGHT', 'OUTER'), ('', '')}

File diff suppressed because one or more lines are too long (3 files)

@@ -68,6 +68,7 @@ from sqlglot.dialects.databricks import Databricks
 from sqlglot.dialects.dialect import Dialect, Dialects
 from sqlglot.dialects.doris import Doris
 from sqlglot.dialects.drill import Drill
+from sqlglot.dialects.druid import Druid
 from sqlglot.dialects.duckdb import DuckDB
 from sqlglot.dialects.hive import Hive
 from sqlglot.dialects.materialize import Materialize

@@ -245,6 +245,7 @@ class ClickHouse(Dialect):
         # * select x from t1 union all (select x from t2 limit 1);
         MODIFIERS_ATTACHED_TO_SET_OP = False
         INTERVAL_SPANS = False
+        OPTIONAL_ALIAS_TOKEN_CTE = False
 
         FUNCTIONS = {
             **parser.Parser.FUNCTIONS,
@@ -62,6 +62,7 @@ class Dialects(str, Enum):
     DATABRICKS = "databricks"
     DORIS = "doris"
     DRILL = "drill"
+    DRUID = "druid"
     DUCKDB = "duckdb"
     HIVE = "hive"
     MATERIALIZE = "materialize"

@@ -1089,7 +1090,19 @@
         this = self.func("SUBSTR", this, position)
         position_offset = f" + {position} - 1"
 
-    return self.func(str_position_func_name, this, substr, instance) + position_offset
+    strpos_sql = self.func(str_position_func_name, this, substr, instance)
+
+    if position_offset:
+        zero = exp.Literal.number(0)
+        # If match is not found (returns 0) the position offset should not be applied
+        case = exp.If(
+            this=exp.EQ(this=strpos_sql, expression=zero),
+            true=zero,
+            false=strpos_sql + position_offset,
+        )
+        strpos_sql = self.sql(case)
+
+    return strpos_sql
 
 
 def struct_extract_sql(self: Generator, expression: exp.StructExtract) -> str:
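
A minimal sketch (not part of the diff) of what the guarded branch produces. MySQL's `LOCATE(substr, str, pos)` is one reader that yields `exp.StrPosition` with a `position` argument; the exact output text is illustrative and left unasserted:

```python
import sqlglot

# Postgres has no native 3-argument STRPOS, so str_position_sql searches a
# SUBSTR slice and re-adds the offset; the new IF guard keeps a miss at 0
# instead of returning the bogus "0 + pos - 1" value.
print(sqlglot.transpile("SELECT LOCATE('a', x, 3)", read="mysql", write="postgres")[0])
```
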
@@ -1557,19 +1570,25 @@ def merge_without_target_sql(self: Generator, expression: exp.Merge) -> str:
             targets.add(normalize(alias.this))
 
     for when in expression.args["whens"].expressions:
-        # only remove the target names from the THEN clause
-        # they're still valid in the <condition> part of WHEN MATCHED / WHEN NOT MATCHED
-        # ref: https://github.com/TobikoData/sqlmesh/issues/2934
-        then = when.args.get("then")
+        # only remove the target table names from certain parts of WHEN MATCHED / WHEN NOT MATCHED
+        # they are still valid in the <condition>, the right hand side of each UPDATE and the VALUES part
+        # (not the column list) of the INSERT
+        then: exp.Insert | exp.Update | None = when.args.get("then")
         if then:
-            then.transform(
-                lambda node: (
-                    exp.column(node.this)
-                    if isinstance(node, exp.Column) and normalize(node.args.get("table")) in targets
-                    else node
-                ),
-                copy=False,
-            )
+            if isinstance(then, exp.Update):
+                for equals in then.find_all(exp.EQ):
+                    equal_lhs = equals.this
+                    if (
+                        isinstance(equal_lhs, exp.Column)
+                        and normalize(equal_lhs.args.get("table")) in targets
+                    ):
+                        equal_lhs.replace(exp.column(equal_lhs.this))
+
+            if isinstance(then, exp.Insert):
+                column_list = then.this
+                if isinstance(column_list, exp.Tuple):
+                    for column in column_list.expressions:
+                        if normalize(column.args.get("table")) in targets:
+                            column.replace(exp.column(column.this))
 
     return self.merge_sql(expression)
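
A sketch of the net effect, assuming the target dialect routes `exp.Merge` through `merge_without_target_sql` (as the PR title suggests Trino and Postgres do); output is printed rather than asserted:

```python
import sqlglot

sql = """
MERGE INTO tgt AS t USING src AS s ON t.id = s.id
WHEN MATCHED AND t.v <> s.v THEN UPDATE SET t.v = s.v
WHEN NOT MATCHED THEN INSERT (id, v) VALUES (s.id, s.v)
"""
# Before this fix the t. prefix was also stripped from the AND t.v <> s.v
# condition; now only the UPDATE left-hand side and the INSERT column list
# are unqualified.
print(sqlglot.transpile(sql, write="trino")[0])
```
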

sqlglot/dialects/druid.py (new file)
@@ -0,0 +1,14 @@
+from sqlglot import exp, generator
+from sqlglot.dialects.dialect import Dialect
+
+
+class Druid(Dialect):
+    class Generator(generator.Generator):
+        # https://druid.apache.org/docs/latest/querying/sql-data-types/
+        TYPE_MAPPING = {
+            **generator.Generator.TYPE_MAPPING,
+            exp.DataType.Type.NCHAR: "STRING",
+            exp.DataType.Type.NVARCHAR: "STRING",
+            exp.DataType.Type.TEXT: "STRING",
+            exp.DataType.Type.UUID: "STRING",
+        }
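
A quick, illustrative check of the new dialect:

```python
import sqlglot

# Druid has no NCHAR/NVARCHAR/TEXT/UUID types, so the TYPE_MAPPING above
# folds them all into STRING on the way out.
print(sqlglot.transpile("SELECT CAST(x AS TEXT), CAST(y AS UUID)", write="druid")[0])
# SELECT CAST(x AS STRING), CAST(y AS STRING)
```
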

@@ -291,6 +291,8 @@ class Oracle(Dialect):
         exp.DateTrunc: lambda self, e: self.func("TRUNC", e.this, e.unit),
         exp.Group: transforms.preprocess([transforms.unalias_group]),
         exp.ILike: no_ilike_sql,
+        exp.LogicalOr: rename_func("MAX"),
+        exp.LogicalAnd: rename_func("MIN"),
         exp.Mod: rename_func("MOD"),
         exp.Select: transforms.preprocess(
             [
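
A sketch of the two new mappings (the dialect pairing is my choice; BigQuery is one reader that parses these aggregates into `exp.LogicalOr`/`exp.LogicalAnd`):

```python
import sqlglot

# Oracle lacks BOOL_OR/BOOL_AND aggregates; over 0/1 operands MAX and MIN
# behave like logical OR and AND, which is what the new entries emit.
print(sqlglot.transpile("SELECT LOGICAL_OR(b), LOGICAL_AND(b) FROM t", read="bigquery", write="oracle")[0])
# SELECT MAX(b), MIN(b) FROM t
```
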

@@ -32,6 +32,7 @@ from sqlglot.dialects.dialect import (
     timestrtotime_sql,
     trim_sql,
     ts_or_ds_add_cast,
+    str_position_sql,
 )
 from sqlglot.helper import is_int, seq_get
 from sqlglot.parser import binary_range_parser

@@ -583,8 +584,7 @@ class Postgres(Dialect):
             ]
         ),
         exp.SHA2: sha256_sql,
-        exp.StrPosition: lambda self,
-        e: f"POSITION({self.sql(e, 'substr')} IN {self.sql(e, 'this')})",
+        exp.StrPosition: str_position_sql,
         exp.StrToDate: lambda self, e: self.func("TO_DATE", e.this, self.format_time(e)),
         exp.StrToTime: lambda self, e: self.func("TO_TIMESTAMP", e.this, self.format_time(e)),
         exp.StructExtract: struct_extract_sql,

@@ -2,7 +2,7 @@ from __future__ import annotations
 
 import typing as t
 
-from sqlglot import exp, generator, parser, tokens, transforms
+from sqlglot import exp, generator, jsonpath, parser, tokens, transforms
 from sqlglot.dialects.dialect import (
     Dialect,
     NormalizationStrategy,

@@ -375,6 +375,10 @@ class Snowflake(Dialect):
         return super().quote_identifier(expression, identify=identify)
 
+    class JSONPathTokenizer(jsonpath.JSONPathTokenizer):
+        SINGLE_TOKENS = jsonpath.JSONPathTokenizer.SINGLE_TOKENS.copy()
+        SINGLE_TOKENS.pop("$")
+
     class Parser(parser.Parser):
         IDENTIFY_PIVOT_STRINGS = True
         DEFAULT_SAMPLING_METHOD = "BERNOULLI"
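
An illustrative sketch of the tokenizer tweak (the path shape is my assumption; the round-trip text is left unasserted):

```python
import sqlglot

# With "$" no longer a single-character token in Snowflake's JSON path
# tokenizer, a key such as $id stays one identifier instead of being split
# at the dollar sign.
print(sqlglot.parse_one("SELECT payload:$id FROM t", read="snowflake").sql("snowflake"))
```
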

@@ -102,8 +102,6 @@ class Spark(Spark2):
         ]
 
     class Parser(Spark2.Parser):
-        OPTIONAL_ALIAS_TOKEN_CTE = True
-
         FUNCTIONS = {
             **Spark2.Parser.FUNCTIONS,
             "ANY_VALUE": _build_with_ignore_nulls(exp.AnyValue),

@@ -1271,4 +1271,4 @@ class TSQL(Dialect):
         )
 
     def isascii_sql(self, expression: exp.IsAscii) -> str:
-        return f"(PATINDEX('%[^' + CHAR(0x00) + '-' + CHAR(0x7f) + ']%' COLLATE Latin1_General_BIN, {self.sql(expression.this)}) = 0)"
+        return f"(PATINDEX(CONVERT(VARCHAR(MAX), 0x255b5e002d7f5d25) COLLATE Latin1_General_BIN, {self.sql(expression.this)}) = 0)"

@@ -1224,6 +1224,7 @@ class Query(Expression):
         append: bool = True,
         dialect: DialectType = None,
         copy: bool = True,
+        scalar: bool = False,
         **opts,
     ) -> Q:
         """

@@ -1244,6 +1245,7 @@
                 Otherwise, this resets the expressions.
             dialect: the dialect used to parse the input expression.
             copy: if `False`, modify this expression instance in-place.
+            scalar: if `True`, this is a scalar common table expression.
             opts: other options to use to parse the input expressions.
 
         Returns:

@@ -1258,6 +1260,7 @@
             append=append,
             dialect=dialect,
             copy=copy,
+            scalar=scalar,
             **opts,
         )
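
A minimal sketch of the new builder flag (ClickHouse chosen because its distinct scalar-CTE syntax is, to my understanding, the motivating case; output is printed, not asserted):

```python
import sqlglot

# The bare SELECT is wrapped in a Subquery (see _apply_cte_builder further
# below) and the resulting CTE carries scalar=True.
q = sqlglot.parse_one("SELECT a FROM t").with_(
    "max_a", as_="SELECT MAX(a) FROM t", scalar=True
)
print(q.sql(dialect="clickhouse"))
```
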
@@ -2326,7 +2329,7 @@ class LoadData(Expression):
 
 
 class Partition(Expression):
-    arg_types = {"expressions": True}
+    arg_types = {"expressions": True, "subpartition": False}
 
 
 class PartitionRange(Expression):

@@ -4713,6 +4716,68 @@ class Alter(Expression):
         return self.args.get("actions") or []
 
 
+class Analyze(Expression):
+    arg_types = {
+        "kind": False,
+        "this": False,
+        "options": False,
+        "mode": False,
+        "partition": False,
+        "expression": False,
+        "properties": False,
+    }
+
+
+class AnalyzeStatistics(Expression):
+    arg_types = {
+        "kind": True,
+        "option": False,
+        "this": False,
+        "expressions": False,
+    }
+
+
+class AnalyzeHistogram(Expression):
+    arg_types = {
+        "this": True,
+        "expressions": True,
+        "expression": False,
+        "update_options": False,
+    }
+
+
+class AnalyzeSample(Expression):
+    arg_types = {"kind": True, "sample": True}
+
+
+class AnalyzeListChainedRows(Expression):
+    arg_types = {"expression": False}
+
+
+class AnalyzeDelete(Expression):
+    arg_types = {"kind": False}
+
+
+class AnalyzeWith(Expression):
+    arg_types = {"expressions": True}
+
+
+class AnalyzeValidate(Expression):
+    arg_types = {
+        "kind": True,
+        "this": False,
+        "expression": False,
+    }
+
+
+class AnalyzeColumns(Expression):
+    pass
+
+
+class UsingData(Expression):
+    pass
+
+
 class AddConstraint(Expression):
     arg_types = {"expressions": True}

@@ -5494,7 +5559,7 @@ class Collate(Binary, Func):
 
 
 class Ceil(Func):
-    arg_types = {"this": True, "decimals": False}
+    arg_types = {"this": True, "decimals": False, "to": False}
     _sql_names = ["CEIL", "CEILING"]

@@ -5809,7 +5874,7 @@ class Unnest(Func, UDTF):
 
 
 class Floor(Func):
-    arg_types = {"this": True, "decimals": False}
+    arg_types = {"this": True, "decimals": False, "to": False}
 
 
 class FromBase64(Func):

@@ -7020,11 +7085,15 @@ def _apply_cte_builder(
     append: bool = True,
     dialect: DialectType = None,
     copy: bool = True,
+    scalar: bool = False,
     **opts,
 ) -> E:
     alias_expression = maybe_parse(alias, dialect=dialect, into=TableAlias, **opts)
-    as_expression = maybe_parse(as_, dialect=dialect, **opts)
-    cte = CTE(this=as_expression, alias=alias_expression, materialized=materialized)
+    as_expression = maybe_parse(as_, dialect=dialect, copy=copy, **opts)
+
+    if scalar and not isinstance(as_expression, Subquery):
+        # scalar CTE must be wrapped in a subquery
+        as_expression = Subquery(this=as_expression)
+
+    cte = CTE(this=as_expression, alias=alias_expression, materialized=materialized, scalar=scalar)
     return _apply_child_list_builder(
         cte,
         instance=instance,

@@ -114,12 +114,15 @@ class Generator(metaclass=_Generator):
         **JSON_PATH_PART_TRANSFORMS,
         exp.AllowedValuesProperty: lambda self,
         e: f"ALLOWED_VALUES {self.expressions(e, flat=True)}",
+        exp.AnalyzeColumns: lambda self, e: self.sql(e, "this"),
+        exp.AnalyzeWith: lambda self, e: self.expressions(e, prefix="WITH ", sep=" "),
         exp.ArrayContainsAll: lambda self, e: self.binary(e, "@>"),
         exp.ArrayOverlaps: lambda self, e: self.binary(e, "&&"),
         exp.AutoRefreshProperty: lambda self, e: f"AUTO REFRESH {self.sql(e, 'this')}",
         exp.BackupProperty: lambda self, e: f"BACKUP {self.sql(e, 'this')}",
         exp.CaseSpecificColumnConstraint: lambda _,
         e: f"{'NOT ' if e.args.get('not_') else ''}CASESPECIFIC",
+        exp.Ceil: lambda self, e: self.ceil_floor(e),
         exp.CharacterSetColumnConstraint: lambda self, e: f"CHARACTER SET {self.sql(e, 'this')}",
         exp.CharacterSetProperty: lambda self,
         e: f"{'DEFAULT ' if e.args.get('default') else ''}CHARACTER SET={self.sql(e, 'this')}",

@@ -140,6 +143,7 @@
         exp.ExecuteAsProperty: lambda self, e: self.naked_property(e),
         exp.Except: lambda self, e: self.set_operations(e),
         exp.ExternalProperty: lambda *_: "EXTERNAL",
+        exp.Floor: lambda self, e: self.ceil_floor(e),
         exp.GlobalProperty: lambda *_: "GLOBAL",
         exp.HeapProperty: lambda *_: "HEAP",
         exp.IcebergProperty: lambda *_: "ICEBERG",

@@ -196,6 +200,7 @@
         exp.TransientProperty: lambda *_: "TRANSIENT",
         exp.Union: lambda self, e: self.set_operations(e),
         exp.UnloggedProperty: lambda *_: "UNLOGGED",
+        exp.UsingData: lambda self, e: f"USING DATA {self.sql(e, 'this')}",
         exp.Uuid: lambda *_: "UUID()",
         exp.UppercaseColumnConstraint: lambda *_: "UPPERCASE",
         exp.VarMap: lambda self, e: self.func("MAP", e.args["keys"], e.args["values"]),

@@ -1556,7 +1561,8 @@
         return f"{prefix}{string}"
 
     def partition_sql(self, expression: exp.Partition) -> str:
-        return f"PARTITION({self.expressions(expression, flat=True)})"
+        partition_keyword = "SUBPARTITION" if expression.args.get("subpartition") else "PARTITION"
+        return f"{partition_keyword}({self.expressions(expression, flat=True)})"
 
     def properties_sql(self, expression: exp.Properties) -> str:
         root_properties = []

@@ -3532,6 +3538,13 @@
         return "".join(sqls)
 
+    def ceil_floor(self, expression: exp.Ceil | exp.Floor) -> str:
+        to_clause = self.sql(expression, "to")
+        if to_clause:
+            return f"{expression.sql_name()}({self.sql(expression, 'this')} TO {to_clause})"
+
+        return self.function_fallback_sql(expression)
+
     def function_fallback_sql(self, expression: exp.Func) -> str:
         args = []

@@ -4647,6 +4660,63 @@
         return f"NAME {name} VALUE {values}"
 
+    def analyzesample_sql(self, expression: exp.AnalyzeSample) -> str:
+        kind = self.sql(expression, "kind")
+        sample = self.sql(expression, "sample")
+        return f"SAMPLE {sample} {kind}"
+
+    def analyzestatistics_sql(self, expression: exp.AnalyzeStatistics) -> str:
+        kind = self.sql(expression, "kind")
+        option = self.sql(expression, "option")
+        option = f" {option}" if option else ""
+        this = self.sql(expression, "this")
+        this = f" {this}" if this else ""
+        columns = self.expressions(expression)
+        columns = f" {columns}" if columns else ""
+        return f"{kind}{option} STATISTICS{this}{columns}"
+
+    def analyzehistogram_sql(self, expression: exp.AnalyzeHistogram) -> str:
+        this = self.sql(expression, "this")
+        columns = self.expressions(expression)
+        inner_expression = self.sql(expression, "expression")
+        inner_expression = f" {inner_expression}" if inner_expression else ""
+        update_options = self.sql(expression, "update_options")
+        update_options = f" {update_options} UPDATE" if update_options else ""
+        return f"{this} HISTOGRAM ON {columns}{inner_expression}{update_options}"
+
+    def analyzedelete_sql(self, expression: exp.AnalyzeDelete) -> str:
+        kind = self.sql(expression, "kind")
+        kind = f" {kind}" if kind else ""
+        return f"DELETE{kind} STATISTICS"
+
+    def analyzelistchainedrows_sql(self, expression: exp.AnalyzeListChainedRows) -> str:
+        inner_expression = self.sql(expression, "expression")
+        return f"LIST CHAINED ROWS{inner_expression}"
+
+    def analyzevalidate_sql(self, expression: exp.AnalyzeValidate) -> str:
+        kind = self.sql(expression, "kind")
+        this = self.sql(expression, "this")
+        this = f" {this}" if this else ""
+        inner_expression = self.sql(expression, "expression")
+        return f"VALIDATE {kind}{this}{inner_expression}"
+
+    def analyze_sql(self, expression: exp.Analyze) -> str:
+        options = self.expressions(expression, key="options", sep=" ")
+        options = f" {options}" if options else ""
+        kind = self.sql(expression, "kind")
+        kind = f" {kind}" if kind else ""
+        this = self.sql(expression, "this")
+        this = f" {this}" if this else ""
+        mode = self.sql(expression, "mode")
+        mode = f" {mode}" if mode else ""
+        properties = self.sql(expression, "properties")
+        properties = f" {properties}" if properties else ""
+        partition = self.sql(expression, "partition")
+        partition = f" {partition}" if partition else ""
+        inner_expression = self.sql(expression, "expression")
+        inner_expression = f" {inner_expression}" if inner_expression else ""
+        return f"ANALYZE{options}{kind}{this}{partition}{mode}{inner_expression}{properties}"
+
     def xmltable_sql(self, expression: exp.XMLTable) -> str:
         this = self.sql(expression, "this")
         passing = self.expressions(expression, key="passing")
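
Together with the parser work further below, these printers make ANALYZE a first-class statement instead of an opaque `exp.Command`. A few illustrative round-trips (the dialect pairings are my choice; outputs are normalized, not byte-identical to the inputs):

```python
import sqlglot

for sql, dialect in [
    ("ANALYZE TABLE tbl COMPUTE STATISTICS FOR ALL COLUMNS", "spark"),
    ("ANALYZE TABLE tbl UPDATE HISTOGRAM ON c1, c2 WITH 4 BUCKETS", "mysql"),
    ("ANALYZE TABLE tbl SUBPARTITION(sp) COMPUTE STATISTICS", "oracle"),
]:
    print(sqlglot.parse_one(sql, read=dialect).sql(dialect))
```
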

@@ -146,9 +146,7 @@ class Scope:
                 self._udtfs.append(node)
             elif isinstance(node, exp.CTE):
                 self._ctes.append(node)
-            elif _is_derived_table(node) and isinstance(
-                node.parent, (exp.From, exp.Join, exp.Subquery)
-            ):
+            elif _is_derived_table(node) and _is_from_or_join(node):
                 self._derived_tables.append(node)
             elif isinstance(node, exp.UNWRAPPED_QUERIES):
                 self._subqueries.append(node)

@@ -661,6 +659,19 @@ def _is_derived_table(expression: exp.Subquery) -> bool:
     )
 
 
+def _is_from_or_join(expression: exp.Expression) -> bool:
+    """
+    Determine if `expression` is the FROM or JOIN clause of a SELECT statement.
+    """
+    parent = expression.parent
+
+    # Subqueries can be arbitrarily nested
+    while isinstance(parent, exp.Subquery):
+        parent = parent.parent
+
+    return isinstance(parent, (exp.From, exp.Join))
+
+
 def _traverse_tables(scope):
     sources = {}
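
A sketch of the nested-union case this fixes (per PR #4603; names are illustrative and the output is printed, not asserted):

```python
from sqlglot import parse_one
from sqlglot.optimizer.scope import build_scope

# The UNION sits behind an extra layer of parentheses, but _is_from_or_join
# walks through the Subquery nesting, so "u" is still registered as a
# derived-table source of the outer SELECT.
root = build_scope(
    parse_one("SELECT a FROM ((SELECT a FROM x) UNION ALL (SELECT a FROM y)) AS u")
)
print(list(root.sources))  # expected to include 'u'
```
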

@@ -16,6 +16,7 @@ if t.TYPE_CHECKING:
     from sqlglot.dialects.dialect import Dialect, DialectType
 
     T = t.TypeVar("T")
+    TCeilFloor = t.TypeVar("TCeilFloor", exp.Ceil, exp.Floor)
 
 logger = logging.getLogger("sqlglot")

@@ -496,6 +497,7 @@ class Parser(metaclass=_Parser):
         TokenType.KEEP,
         TokenType.KILL,
         TokenType.LEFT,
+        TokenType.LIMIT,
         TokenType.LOAD,
         TokenType.MERGE,
         TokenType.NATURAL,

@@ -551,7 +553,6 @@
         TokenType.LEFT,
         TokenType.LOCK,
         TokenType.NATURAL,
-        TokenType.OFFSET,
         TokenType.RIGHT,
         TokenType.SEMI,
         TokenType.WINDOW,

@@ -788,6 +789,7 @@
     STATEMENT_PARSERS = {
         TokenType.ALTER: lambda self: self._parse_alter(),
+        TokenType.ANALYZE: lambda self: self._parse_analyze(),
         TokenType.BEGIN: lambda self: self._parse_transaction(),
         TokenType.CACHE: lambda self: self._parse_cache(),
         TokenType.COMMENT: lambda self: self._parse_comment(),

@@ -1126,9 +1128,11 @@
     FUNCTION_PARSERS = {
         "CAST": lambda self: self._parse_cast(self.STRICT_CAST),
+        "CEIL": lambda self: self._parse_ceil_floor(exp.Ceil),
         "CONVERT": lambda self: self._parse_convert(self.STRICT_CAST),
         "DECODE": lambda self: self._parse_decode(),
         "EXTRACT": lambda self: self._parse_extract(),
+        "FLOOR": lambda self: self._parse_ceil_floor(exp.Floor),
         "GAP_FILL": lambda self: self._parse_gap_fill(),
         "JSON_OBJECT": lambda self: self._parse_json_object(),
         "JSON_OBJECTAGG": lambda self: self._parse_json_object(agg=True),

@@ -1325,6 +1329,33 @@
     # The style options for the DESCRIBE statement
     DESCRIBE_STYLES = {"ANALYZE", "EXTENDED", "FORMATTED", "HISTORY"}
 
+    # The style options for the ANALYZE statement
+    ANALYZE_STYLES = {
+        "BUFFER_USAGE_LIMIT",
+        "FULL",
+        "LOCAL",
+        "NO_WRITE_TO_BINLOG",
+        "SAMPLE",
+        "SKIP_LOCKED",
+        "VERBOSE",
+    }
+
+    ANALYZE_EXPRESSION_PARSERS = {
+        "ALL": lambda self: self._parse_analyze_columns(),
+        "COMPUTE": lambda self: self._parse_analyze_statistics(),
+        "DELETE": lambda self: self._parse_analyze_delete(),
+        "DROP": lambda self: self._parse_analyze_histogram(),
+        "ESTIMATE": lambda self: self._parse_analyze_statistics(),
+        "LIST": lambda self: self._parse_analyze_list(),
+        "PREDICATE": lambda self: self._parse_analyze_columns(),
+        "UPDATE": lambda self: self._parse_analyze_histogram(),
+        "VALIDATE": lambda self: self._parse_analyze_validate(),
+    }
+
+    PARTITION_KEYWORDS = {"PARTITION", "SUBPARTITION"}
+
+    AMBIGUOUS_ALIAS_TOKENS = (TokenType.LIMIT, TokenType.OFFSET)
+
     OPERATION_MODIFIERS: t.Set[str] = set()
 
     STRICT_CAST = True

@@ -1382,7 +1413,7 @@
     WRAPPED_TRANSFORM_COLUMN_CONSTRAINT = True
 
     # Whether the 'AS' keyword is optional in the CTE definition syntax
-    OPTIONAL_ALIAS_TOKEN_CTE = False
+    OPTIONAL_ALIAS_TOKEN_CTE = True
 
     __slots__ = (
         "error_level",
@@ -2955,11 +2986,13 @@
         )
 
     def _parse_partition(self) -> t.Optional[exp.Partition]:
-        if not self._match(TokenType.PARTITION):
+        if not self._match_texts(self.PARTITION_KEYWORDS):
             return None
 
         return self.expression(
-            exp.Partition, expressions=self._parse_wrapped_csv(self._parse_assignment)
+            exp.Partition,
+            subpartition=self._prev.text.upper() == "SUBPARTITION",
+            expressions=self._parse_wrapped_csv(self._parse_assignment),
         )
 
     def _parse_value(self) -> t.Optional[exp.Tuple]:

@@ -3175,6 +3208,12 @@
     def _parse_table_alias(
         self, alias_tokens: t.Optional[t.Collection[TokenType]] = None
     ) -> t.Optional[exp.TableAlias]:
+        # In some dialects, LIMIT and OFFSET can act as both identifiers and keywords (clauses)
+        # so this section tries to parse the clause version and if it fails, it treats the token
+        # as an identifier (alias)
+        if self._can_parse_limit_or_offset():
+            return None
+
         any_token = self._match(TokenType.ALIAS)
         alias = (
             self._parse_id_var(any_token=any_token, tokens=alias_tokens or self.TABLE_ALIAS_TOKENS)

@@ -4424,6 +4463,18 @@
             exp.Offset, this=this, expression=count, expressions=self._parse_limit_by()
         )
 
+    def _can_parse_limit_or_offset(self) -> bool:
+        if not self._match_set(self.AMBIGUOUS_ALIAS_TOKENS, advance=False):
+            return False
+
+        index = self._index
+        result = bool(
+            self._try_parse(self._parse_limit, retreat=True)
+            or self._try_parse(self._parse_offset, retreat=True)
+        )
+        self._retreat(index)
+
+        return result
+
     def _parse_limit_by(self) -> t.Optional[t.List[exp.Expression]]:
         return self._match_text_seq("BY") and self._parse_csv(self._parse_bitwise)

@@ -6633,6 +6684,12 @@
     def _parse_alias(
         self, this: t.Optional[exp.Expression], explicit: bool = False
     ) -> t.Optional[exp.Expression]:
+        # In some dialects, LIMIT and OFFSET can act as both identifiers and keywords (clauses)
+        # so this section tries to parse the clause version and if it fails, it treats the token
+        # as an identifier (alias)
+        if self._can_parse_limit_or_offset():
+            return this
+
         any_token = self._match(TokenType.ALIAS)
         comments = self._prev_comments or []
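
These two early returns implement the limit/offset breaking change from the changelog (PR #4589). A minimal sketch of the coexistence they enable (outputs printed, not asserted):

```python
import sqlglot

# "limit"/"offset" as explicit aliases alongside real LIMIT/OFFSET clauses:
print(sqlglot.parse_one("SELECT a AS limit FROM t LIMIT 10").sql())
print(sqlglot.parse_one("SELECT a AS offset FROM t OFFSET 5").sql())
```
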
@ -7079,6 +7136,180 @@ class Parser(metaclass=_Parser):
return self._parse_as_command(start)
def _parse_analyze(self) -> exp.Analyze | exp.Command:
start = self._prev
# https://duckdb.org/docs/sql/statements/analyze
if not self._curr:
return self.expression(exp.Analyze)
options = []
while self._match_texts(self.ANALYZE_STYLES):
if self._prev.text.upper() == "BUFFER_USAGE_LIMIT":
options.append(f"BUFFER_USAGE_LIMIT {self._parse_number()}")
else:
options.append(self._prev.text.upper())
this: t.Optional[exp.Expression] = None
inner_expression: t.Optional[exp.Expression] = None
kind = self._curr and self._curr.text.upper()
if self._match(TokenType.TABLE) or self._match(TokenType.INDEX):
this = self._parse_table_parts()
elif self._match_text_seq("TABLES"):
if self._match_set((TokenType.FROM, TokenType.IN)):
kind = f"{kind} {self._prev.text.upper()}"
this = self._parse_table(schema=True, is_db_reference=True)
elif self._match_text_seq("DATABASE"):
this = self._parse_table(schema=True, is_db_reference=True)
elif self._match_text_seq("CLUSTER"):
this = self._parse_table()
# Try matching inner expr keywords before fallback to parse table.
elif self._match_texts(self.ANALYZE_EXPRESSION_PARSERS):
kind = None
inner_expression = self.ANALYZE_EXPRESSION_PARSERS[self._prev.text.upper()](self)
else:
# Empty kind https://prestodb.io/docs/current/sql/analyze.html
kind = None
this = self._parse_table_parts()
partition = self._try_parse(self._parse_partition)
if not partition and self._match_texts(self.PARTITION_KEYWORDS):
return self._parse_as_command(start)
# https://docs.starrocks.io/docs/sql-reference/sql-statements/cbo_stats/ANALYZE_TABLE/
if self._match_text_seq("WITH", "SYNC", "MODE") or self._match_text_seq(
"WITH", "ASYNC", "MODE"
):
mode = f"WITH {self._tokens[self._index-2].text.upper()} MODE"
else:
mode = None
if self._match_texts(self.ANALYZE_EXPRESSION_PARSERS):
inner_expression = self.ANALYZE_EXPRESSION_PARSERS[self._prev.text.upper()](self)
properties = self._parse_properties()
return self.expression(
exp.Analyze,
kind=kind,
this=this,
mode=mode,
partition=partition,
properties=properties,
expression=inner_expression,
options=options,
)
# https://spark.apache.org/docs/3.5.1/sql-ref-syntax-aux-analyze-table.html
def _parse_analyze_statistics(self) -> exp.AnalyzeStatistics:
this = None
kind = self._prev.text.upper()
option = self._prev.text.upper() if self._match_text_seq("DELTA") else None
expressions = []
if not self._match_text_seq("STATISTICS"):
self.raise_error("Expecting token STATISTICS")
if self._match_text_seq("NOSCAN"):
this = "NOSCAN"
elif self._match(TokenType.FOR):
if self._match_text_seq("ALL", "COLUMNS"):
this = "FOR ALL COLUMNS"
if self._match_texts("COLUMNS"):
this = "FOR COLUMNS"
expressions = self._parse_csv(self._parse_column_reference)
elif self._match_text_seq("SAMPLE"):
sample = self._parse_number()
expressions = [
self.expression(
exp.AnalyzeSample,
sample=sample,
kind=self._prev.text.upper() if self._match(TokenType.PERCENT) else None,
)
]
return self.expression(
exp.AnalyzeStatistics, kind=kind, option=option, this=this, expressions=expressions
)
# https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/ANALYZE.html
def _parse_analyze_validate(self) -> exp.AnalyzeValidate:
kind = None
this = None
expression: t.Optional[exp.Expression] = None
if self._match_text_seq("REF", "UPDATE"):
kind = "REF"
this = "UPDATE"
if self._match_text_seq("SET", "DANGLING", "TO", "NULL"):
this = "UPDATE SET DANGLING TO NULL"
elif self._match_text_seq("STRUCTURE"):
kind = "STRUCTURE"
if self._match_text_seq("CASCADE", "FAST"):
this = "CASCADE FAST"
elif self._match_text_seq("CASCADE", "COMPLETE") and self._match_texts(
("ONLINE", "OFFLINE")
):
this = f"CASCADE COMPLETE {self._prev.text.upper()}"
expression = self._parse_into()
return self.expression(exp.AnalyzeValidate, kind=kind, this=this, expression=expression)
def _parse_analyze_columns(self) -> t.Optional[exp.AnalyzeColumns]:
this = self._prev.text.upper()
if self._match_text_seq("COLUMNS"):
return self.expression(exp.AnalyzeColumns, this=f"{this} {self._prev.text.upper()}")
return None
def _parse_analyze_delete(self) -> t.Optional[exp.AnalyzeDelete]:
kind = self._prev.text.upper() if self._match_text_seq("SYSTEM") else None
if self._match_text_seq("STATISTICS"):
return self.expression(exp.AnalyzeDelete, kind=kind)
return None
def _parse_analyze_list(self) -> t.Optional[exp.AnalyzeListChainedRows]:
if self._match_text_seq("CHAINED", "ROWS"):
return self.expression(exp.AnalyzeListChainedRows, expression=self._parse_into())
return None
    # https://dev.mysql.com/doc/refman/8.4/en/analyze-table.html
    def _parse_analyze_histogram(self) -> exp.AnalyzeHistogram:
        this = self._prev.text.upper()
        expression: t.Optional[exp.Expression] = None
        expressions = []
        update_options = None

        if self._match_text_seq("HISTOGRAM", "ON"):
            expressions = self._parse_csv(self._parse_column_reference)
            with_expressions = []
            while self._match(TokenType.WITH):
                # https://docs.starrocks.io/docs/sql-reference/sql-statements/cbo_stats/ANALYZE_TABLE/
                if self._match_texts(("SYNC", "ASYNC")):
                    if self._match_text_seq("MODE", advance=False):
                        with_expressions.append(f"{self._prev.text.upper()} MODE")
                        self._advance()
                else:
                    buckets = self._parse_number()
                    if self._match_text_seq("BUCKETS"):
                        with_expressions.append(f"{buckets} BUCKETS")
            if with_expressions:
                expression = self.expression(exp.AnalyzeWith, expressions=with_expressions)

            if self._match_texts(("MANUAL", "AUTO")) and self._match(
                TokenType.UPDATE, advance=False
            ):
                update_options = self._prev.text.upper()
                self._advance()
            elif self._match_text_seq("USING", "DATA"):
                expression = self.expression(exp.UsingData, this=self._parse_string())

        return self.expression(
            exp.AnalyzeHistogram,
            this=this,
            expressions=expressions,
            expression=expression,
            update_options=update_options,
        )
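A hedged sketch of how the histogram branch above should surface, assuming the inner expression lands in the Analyze node's "expression" arg as the earlier return statement suggests:

import sqlglot
from sqlglot import exp

node = sqlglot.parse_one("ANALYZE tbl UPDATE HISTOGRAM ON col1 WITH 5 BUCKETS", read="mysql")
hist = node.args.get("expression")
# Expected to be an exp.AnalyzeHistogram whose WITH clause captured "5 BUCKETS".
assert isinstance(hist, exp.AnalyzeHistogram)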
    def _parse_merge(self) -> exp.Merge:
        self._match(TokenType.INTO)
        target = self._parse_table()

@ -7640,6 +7871,16 @@ class Parser(metaclass=_Parser):
            form=self._match(TokenType.COMMA) and self._parse_var(),
        )

    def _parse_ceil_floor(self, expr_type: t.Type[TCeilFloor]) -> TCeilFloor:
        args = self._parse_csv(lambda: self._parse_lambda())
        this = seq_get(args, 0)
        decimals = seq_get(args, 1)
        return expr_type(
            this=this, decimals=decimals, to=self._match_text_seq("TO") and self._parse_var()
        )
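A small sketch of the CEIL/FLOOR "TO unit" form this parser accepts (added for the new Druid dialect; the "to" arg name is taken from the constructor call above):

import sqlglot

ceil = sqlglot.parse_one("SELECT CEIL(__time TO WEEK) FROM t", read="druid").selects[0]
print(ceil.args.get("to"))  # the WEEK unit captured by _parse_ceil_floor
print(ceil.sql("druid"))    # CEIL(__time TO WEEK)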
    def _parse_star_ops(self) -> t.Optional[exp.Expression]:
        if self._match_text_seq("COLUMNS", "(", advance=False):
            this = self._parse_function()

View file

@ -410,6 +410,7 @@ class TokenType(AutoName):
    OPTION = auto()
    SINK = auto()
    SOURCE = auto()
+   ANALYZE = auto()
    NAMESPACE = auto()
@ -938,7 +939,7 @@ class Tokenizer(metaclass=_Tokenizer):
"SEQUENCE": TokenType.SEQUENCE,
"VARIANT": TokenType.VARIANT,
"ALTER": TokenType.ALTER,
"ANALYZE": TokenType.COMMAND,
"ANALYZE": TokenType.ANALYZE,
"CALL": TokenType.COMMAND,
"COMMENT": TokenType.COMMENT,
"EXPLAIN": TokenType.COMMAND,

sqlglotrs/Cargo.lock (generated)
View file

@ -448,6 +448,12 @@ version = "0.8.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2b15c43186be67a4fd63bee50d0303afffcef381492ebe2c5d87f324e1b8815c"
[[package]]
name = "rustc-hash"
version = "2.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c7fb8039b3032c191086b10f11f319a6e99e1e82889c5cc6046f515c9db1d497"
[[package]]
name = "ryu"
version = "1.0.18"
@ -497,10 +503,11 @@ dependencies = [
[[package]]
name = "sqlglotrs"
-version = "0.3.4"
+version = "0.3.5"
dependencies = [
 "criterion",
 "pyo3",
+ "rustc-hash",
 "serde",
 "serde_json",
 "sqlglotrs",

View file

@ -1,12 +1,13 @@
[package]
name = "sqlglotrs"
-version = "0.3.4"
+version = "0.3.5"
edition = "2021"
license = "MIT"

[lib]
name = "sqlglotrs"
crate-type = ["cdylib", "rlib"]
bench = false

[[bench]]
name = "long"

@ -19,6 +20,7 @@ profiling = ["serde", "serde_json"]
[dependencies]
pyo3 = {version ="0.22.6", features = ["auto-initialize"]}
+rustc-hash = { version = "2.1" }

# Optional dependencies used for profiling
serde = { version = "1", features = ["derive"] , optional = true }

View file

@ -1,6 +1,6 @@
-use std::collections::{HashMap, HashSet};

use pyo3::prelude::*;
+use rustc_hash::FxHashMap as HashMap;
+use rustc_hash::FxHashSet as HashSet;

pub type TokenType = u16;

View file

@ -49,13 +49,11 @@ impl Token {
    pub fn append_comments(&self, comments: &mut Vec<String>) {
        Python::with_gil(|py| {
            let pylist = self.comments.bind(py);
-           for comment in comments.iter() {
+           for comment in comments.drain(..) {
                if let Err(_) = pylist.append(comment) {
                    panic!("Failed to append comments to the Python list");
                }
            }
        });
-       // Simulate `Vec::append`.
-       let _ = std::mem::replace(comments, Vec::new());
    }
}

View file

@ -23,14 +23,11 @@ pub struct Tokenizer {
impl Tokenizer {
    #[new]
    pub fn new(settings: TokenizerSettings, token_types: TokenTypeSettings) -> Tokenizer {
-       let mut keyword_trie = Trie::new();
-       let single_token_strs: Vec<String> = settings
-           .single_tokens
-           .keys()
-           .map(|s| s.to_string())
-           .collect();
-       let trie_filter =
-           |key: &&String| key.contains(" ") || single_token_strs.iter().any(|t| key.contains(t));
+       let mut keyword_trie = Trie::default();
+       let trie_filter = |key: &&String| {
+           key.contains(" ") || settings.single_tokens.keys().any(|&t| key.contains(t))
+       };

        keyword_trie.add(settings.keywords.keys().filter(trie_filter));
        keyword_trie.add(settings.comments.keys().filter(trie_filter));
@ -114,7 +111,7 @@ impl<'a> TokenizerState<'a> {
    fn tokenize(&mut self) -> Result<Vec<Token>, TokenizerError> {
        self.scan(None)?;
-       Ok(std::mem::replace(&mut self.tokens, Vec::new()))
+       Ok(std::mem::take(&mut self.tokens))
    }

    fn scan(&mut self, until_peek_char: Option<char>) -> Result<(), TokenizerError> {
@ -146,7 +143,7 @@ impl<'a> TokenizerState<'a> {
        }

        if !self.settings.white_space.contains_key(&self.current_char) {
-           if self.current_char.is_digit(10) {
+           if self.current_char.is_ascii_digit() {
                self.scan_number()?;
            } else if let Some(identifier_end) =
                self.settings.identifiers.get(&self.current_char)
@ -205,7 +202,7 @@ impl<'a> TokenizerState<'a> {
    }

    fn char_at(&self, index: usize) -> Result<char, TokenizerError> {
-       self.sql.get(index).map(|c| *c).ok_or_else(|| {
+       self.sql.get(index).copied().ok_or_else(|| {
            self.error(format!(
                "Index {} is out of bound (size {})",
                index, self.size
@ -237,7 +234,7 @@ impl<'a> TokenizerState<'a> {
            self.column,
            self.start,
            self.current - 1,
-           std::mem::replace(&mut self.comments, Vec::new()),
+           std::mem::take(&mut self.comments),
        ));

        // If we have either a semicolon or a begin token before the command's token, we'll parse
@ -503,7 +500,7 @@ impl<'a> TokenizerState<'a> {
        let mut scientific = 0;

        loop {
-           if self.peek_char.is_digit(10) {
+           if self.peek_char.is_ascii_digit() {
                self.advance(1)?;
            } else if self.peek_char == '.' && !decimal {
                if self.tokens.last().map(|t| t.token_type) == Some(self.token_types.parameter) {
@ -537,8 +534,7 @@ impl<'a> TokenizerState<'a> {
                .numeric_literals
                .get(&literal.to_uppercase())
                .unwrap_or(&String::from("")),
-       )
-       .map(|x| *x);
+       ).copied();

        let replaced = literal.replace("_", "");
@ -607,8 +603,7 @@ impl<'a> TokenizerState<'a> {
        } else {
            self.settings
                .keywords
-               .get(&self.text().to_uppercase())
-               .map(|x| *x)
+               .get(&self.text().to_uppercase()).copied()
                .unwrap_or(self.token_types.var)
        };
        self.add(token_type, None)
@ -718,13 +713,13 @@ impl<'a> TokenizerState<'a> {
            if i == 0 {
                self.is_alphabetic_or_underscore(c)
            } else {
-               self.is_alphabetic_or_underscore(c) || c.is_digit(10)
+               self.is_alphabetic_or_underscore(c) || c.is_ascii_digit()
            }
        })
    }

    fn is_numeric(&mut self, s: &str) -> bool {
-       s.chars().all(|c| c.is_digit(10))
+       s.chars().all(|c| c.is_ascii_digit())
    }

    fn extract_value(&mut self) -> Result<String, TokenizerError> {

View file

@ -1,6 +1,6 @@
-use std::collections::HashMap;
+use rustc_hash::FxHashMap as HashMap;

-#[derive(Debug)]
+#[derive(Debug, Default)]
pub struct TrieNode {
    is_word: bool,
    children: HashMap<char, TrieNode>,
@ -35,21 +35,12 @@ impl TrieNode {
    }
}

-#[derive(Debug)]
+#[derive(Debug, Default)]
pub struct Trie {
    pub root: TrieNode,
}

impl Trie {
-   pub fn new() -> Self {
-       Trie {
-           root: TrieNode {
-               is_word: false,
-               children: HashMap::new(),
-           },
-       }
-   }
-
    pub fn add<'a, I>(&mut self, keys: I)
    where
        I: Iterator<Item = &'a String>,

@ -59,7 +50,7 @@ impl Trie {
        for c in key.chars() {
            current = current.children.entry(c).or_insert(TrieNode {
                is_word: false,
-               children: HashMap::new(),
+               children: HashMap::default(),
            });
        }
        current.is_word = true;

View file

@ -291,3 +291,15 @@ class TestDatabricks(Validator):
self.validate_identity("GRANT SELECT ON TABLE sample_data TO `alf@melmak.et`")
self.validate_identity("GRANT ALL PRIVILEGES ON TABLE forecasts TO finance")
self.validate_identity("GRANT SELECT ON TABLE t TO `fab9e00e-ca35-11ec-9d64-0242ac120002`")
def test_analyze(self):
self.validate_identity("ANALYZE TABLE tbl COMPUTE DELTA STATISTICS NOSCAN")
self.validate_identity("ANALYZE TABLE tbl COMPUTE DELTA STATISTICS FOR ALL COLUMNS")
self.validate_identity("ANALYZE TABLE tbl COMPUTE DELTA STATISTICS FOR COLUMNS foo, bar")
self.validate_identity("ANALYZE TABLE ctlg.db.tbl COMPUTE DELTA STATISTICS NOSCAN")
self.validate_identity("ANALYZE TABLES COMPUTE STATISTICS NOSCAN")
self.validate_identity("ANALYZE TABLES FROM db COMPUTE STATISTICS")
self.validate_identity("ANALYZE TABLES IN db COMPUTE STATISTICS")
self.validate_identity(
"ANALYZE TABLE ctlg.db.tbl PARTITION(foo = 'foo', bar = 'bar') COMPUTE STATISTICS NOSCAN"
)

View file

@ -528,7 +528,7 @@ class TestDialect(Validator):
"sqlite": "SELECT (NOT x GLOB CAST(x'2a5b5e012d7f5d2a' AS TEXT))",
"mysql": "SELECT REGEXP_LIKE(x, '^[[:ascii:]]*$')",
"postgres": "SELECT (x ~ '^[[:ascii:]]*$')",
"tsql": "SELECT (PATINDEX('%[^' + CHAR(0x00) + '-' + CHAR(0x7f) + ']%' COLLATE Latin1_General_BIN, x) = 0)",
"tsql": "SELECT (PATINDEX(CONVERT(VARCHAR(MAX), 0x255b5e002d7f5d25) COLLATE Latin1_General_BIN, x) = 0)",
"oracle": "SELECT NVL(REGEXP_LIKE(x, '^[' || CHR(1) || '-' || CHR(127) || ']*$'), TRUE)",
},
)
@ -1686,7 +1686,7 @@ class TestDialect(Validator):
            write={
                "drill": "STRPOS(haystack, needle)",
                "duckdb": "STRPOS(haystack, needle)",
-               "postgres": "POSITION(needle IN haystack)",
+               "postgres": "STRPOS(haystack, needle)",
                "presto": "STRPOS(haystack, needle)",
                "spark": "LOCATE(needle, haystack)",
                "clickhouse": "position(haystack, needle)",
@ -1699,7 +1699,7 @@ class TestDialect(Validator):
            write={
                "drill": "STRPOS(haystack, needle)",
                "duckdb": "STRPOS(haystack, needle)",
-               "postgres": "POSITION(needle IN haystack)",
+               "postgres": "STRPOS(haystack, needle)",
                "presto": "STRPOS(haystack, needle)",
                "bigquery": "STRPOS(haystack, needle)",
                "spark": "LOCATE(needle, haystack)",
@ -1711,8 +1711,9 @@ class TestDialect(Validator):
        self.validate_all(
            "POSITION(needle, haystack, pos)",
            write={
-               "drill": "STRPOS(SUBSTR(haystack, pos), needle) + pos - 1",
-               "presto": "STRPOS(SUBSTR(haystack, pos), needle) + pos - 1",
+               "drill": "`IF`(STRPOS(SUBSTR(haystack, pos), needle) = 0, 0, STRPOS(SUBSTR(haystack, pos), needle) + pos - 1)",
+               "presto": "IF(STRPOS(SUBSTR(haystack, pos), needle) = 0, 0, STRPOS(SUBSTR(haystack, pos), needle) + pos - 1)",
+               "postgres": "CASE WHEN STRPOS(SUBSTR(haystack, pos), needle) = 0 THEN 0 ELSE STRPOS(SUBSTR(haystack, pos), needle) + pos - 1 END",
                "spark": "LOCATE(needle, haystack, pos)",
                "clickhouse": "position(haystack, needle, pos)",
                "snowflake": "POSITION(needle, haystack, pos)",
@ -2335,6 +2336,17 @@ SELECT
            },
        )

        # needs to preserve the target alias in the WHEN condition and function, but not in the THEN clause
        self.validate_all(
            """MERGE INTO foo AS target USING (SELECT a, b FROM tbl) AS src ON src.a = target.a
            WHEN MATCHED THEN UPDATE SET target.b = COALESCE(src.b, target.b)
            WHEN NOT MATCHED THEN INSERT (target.a, target.b) VALUES (src.a, src.b)""",
            write={
                "trino": """MERGE INTO foo AS target USING (SELECT a, b FROM tbl) AS src ON src.a = target.a WHEN MATCHED THEN UPDATE SET b = COALESCE(src.b, target.b) WHEN NOT MATCHED THEN INSERT (a, b) VALUES (src.a, src.b)""",
                "postgres": """MERGE INTO foo AS target USING (SELECT a, b FROM tbl) AS src ON src.a = target.a WHEN MATCHED THEN UPDATE SET b = COALESCE(src.b, target.b) WHEN NOT MATCHED THEN INSERT (a, b) VALUES (src.a, src.b)""",
            },
        )
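The breaking MERGE change is easiest to see end to end; a sketch mirroring the test above:

import sqlglot

sql = """MERGE INTO foo AS target USING (SELECT a, b FROM tbl) AS src ON src.a = target.a
WHEN MATCHED THEN UPDATE SET target.b = COALESCE(src.b, target.b)
WHEN NOT MATCHED THEN INSERT (target.a, target.b) VALUES (src.a, src.b)"""

# Trino/Postgres keep the target alias in the ON/WHEN conditions but strip it
# from the UPDATE SET and INSERT column lists, which those engines reject.
print(sqlglot.transpile(sql, write="trino")[0])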
    def test_substring(self):
        self.validate_all(
            "SUBSTR('123456', 2, 3)",

View file

@ -100,3 +100,8 @@ class TestDoris(Validator):
"doris": "SELECT REGEXP(abc, '%foo%')",
},
)
def test_analyze(self):
self.validate_identity("ANALYZE TABLE tbl")
self.validate_identity("ANALYZE DATABASE db")
self.validate_identity("ANALYZE TABLE TBL(c1, c2)")

View file

@ -19,3 +19,7 @@ class TestDrill(Validator):
"mysql": "SELECT '2021-01-01' + INTERVAL '1' MONTH",
},
)
def test_analyze(self):
self.validate_identity("ANALYZE TABLE tbl COMPUTE STATISTICS")
self.validate_identity("ANALYZE TABLE tbl COMPUTE STATISTICS SAMPLE 5 PERCENT")

View file

@ -0,0 +1,21 @@
from sqlglot.dialects.dialect import Dialects
from tests.dialects.test_dialect import Validator


class TestDruid(Validator):
    dialect = "druid"

    def test_druid(self):
        self.validate_identity("SELECT CEIL(__time TO WEEK) FROM t")
        self.validate_identity("SELECT CEIL(col) FROM t")
        self.validate_identity("SELECT CEIL(price, 2) AS rounded_price FROM t")
        self.validate_identity("SELECT FLOOR(__time TO WEEK) FROM t")
        self.validate_identity("SELECT FLOOR(col) FROM t")
        self.validate_identity("SELECT FLOOR(price, 2) AS rounded_price FROM t")

        # validate across all dialects
        write = {dialect.value: "FLOOR(__time TO WEEK)" for dialect in Dialects}
        self.validate_all(
            "FLOOR(__time TO WEEK)",
            write=write,
        )
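A sketch of what the cross-dialect loop above asserts (Dialects is the enum the test imports; its empty-string member is sqlglot's default dialect):

import sqlglot
from sqlglot.dialects.dialect import Dialects

for dialect in Dialects:
    out = sqlglot.transpile("FLOOR(__time TO WEEK)", read="druid", write=dialect.value)[0]
    assert out == "FLOOR(__time TO WEEK)"  # round-trips unchanged everywhere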

View file

@ -1448,3 +1448,6 @@ class TestDuckDB(Validator):
"WITH t1 AS (FROM (FROM t2 SELECT foo1, foo2)) FROM t1",
"WITH t1 AS (SELECT * FROM (SELECT foo1, foo2 FROM t2)) SELECT * FROM t1",
)
def test_analyze(self):
self.validate_identity("ANALYZE")

View file

@ -588,8 +588,8 @@ class TestHive(Validator):
        self.validate_all(
            "LOCATE('a', x, 3)",
            write={
-               "duckdb": "STRPOS(SUBSTR(x, 3), 'a') + 3 - 1",
-               "presto": "STRPOS(SUBSTR(x, 3), 'a') + 3 - 1",
+               "duckdb": "CASE WHEN STRPOS(SUBSTR(x, 3), 'a') = 0 THEN 0 ELSE STRPOS(SUBSTR(x, 3), 'a') + 3 - 1 END",
+               "presto": "IF(STRPOS(SUBSTR(x, 3), 'a') = 0, 0, STRPOS(SUBSTR(x, 3), 'a') + 3 - 1)",
                "hive": "LOCATE('a', x, 3)",
                "spark": "LOCATE('a', x, 3)",
            },

View file

@ -1378,3 +1378,13 @@ COMMENT='客户账户表'"""
"mysql": "SELECT FORMAT(12332.2, 2, 'de_DE')",
},
)
def test_analyze(self):
self.validate_identity("ANALYZE LOCAL TABLE tbl")
self.validate_identity("ANALYZE NO_WRITE_TO_BINLOG TABLE tbl")
self.validate_identity("ANALYZE tbl UPDATE HISTOGRAM ON col1")
self.validate_identity("ANALYZE tbl UPDATE HISTOGRAM ON col1 USING DATA 'json_data'")
self.validate_identity("ANALYZE tbl UPDATE HISTOGRAM ON col1 WITH 5 BUCKETS")
self.validate_identity("ANALYZE tbl UPDATE HISTOGRAM ON col1 WITH 5 BUCKETS AUTO UPDATE")
self.validate_identity("ANALYZE tbl UPDATE HISTOGRAM ON col1 WITH 5 BUCKETS MANUAL UPDATE")
self.validate_identity("ANALYZE tbl DROP HISTOGRAM ON col1")

View file

@ -654,3 +654,27 @@ WHERE
"'W'",
):
self.validate_identity(f"TRUNC(x, {unit})")
def test_analyze(self):
self.validate_identity("ANALYZE TABLE tbl")
self.validate_identity("ANALYZE INDEX ndx")
self.validate_identity("ANALYZE TABLE db.tbl PARTITION(foo = 'foo', bar = 'bar')")
self.validate_identity("ANALYZE TABLE db.tbl SUBPARTITION(foo = 'foo', bar = 'bar')")
self.validate_identity("ANALYZE INDEX db.ndx PARTITION(foo = 'foo', bar = 'bar')")
self.validate_identity("ANALYZE INDEX db.ndx PARTITION(part1)")
self.validate_identity("ANALYZE CLUSTER db.cluster")
self.validate_identity("ANALYZE TABLE tbl VALIDATE REF UPDATE")
self.validate_identity("ANALYZE LIST CHAINED ROWS")
self.validate_identity("ANALYZE LIST CHAINED ROWS INTO tbl")
self.validate_identity("ANALYZE DELETE STATISTICS")
self.validate_identity("ANALYZE DELETE SYSTEM STATISTICS")
self.validate_identity("ANALYZE VALIDATE REF UPDATE")
self.validate_identity("ANALYZE VALIDATE REF UPDATE SET DANGLING TO NULL")
self.validate_identity("ANALYZE VALIDATE STRUCTURE")
self.validate_identity("ANALYZE VALIDATE STRUCTURE CASCADE FAST")
self.validate_identity(
"ANALYZE TABLE tbl VALIDATE STRUCTURE CASCADE COMPLETE ONLINE INTO db.tbl"
)
self.validate_identity(
"ANALYZE TABLE tbl VALIDATE STRUCTURE CASCADE COMPLETE OFFLINE INTO db.tbl"
)

View file

@ -1316,3 +1316,9 @@ CROSS JOIN JSON_ARRAY_ELEMENTS(CAST(JSON_EXTRACT_PATH(tbox, 'boxes') AS JSON)) A
        self.validate_identity(
            "SELECT XMLELEMENT(NAME foo, XMLATTRIBUTES('xyz' AS bar), XMLELEMENT(NAME abc), XMLCOMMENT('test'), XMLELEMENT(NAME xyz))"
        )

    def test_analyze(self):
        self.validate_identity("ANALYZE TBL")
        self.validate_identity("ANALYZE TBL(col1, col2)")
        self.validate_identity("ANALYZE VERBOSE SKIP_LOCKED TBL(col1, col2)")
        self.validate_identity("ANALYZE BUFFER_USAGE_LIMIT 1337 TBL")

View file

@ -1296,3 +1296,7 @@ MATCH_RECOGNIZE (
        # If the setting is overridden to False, then generate ROW access (dot notation)
        self.assertEqual(s.sql(dialect_row_access_setting), 'SELECT col.x.y."special string"')

    def test_analyze(self):
        self.validate_identity("ANALYZE tbl")
        self.validate_identity("ANALYZE tbl WITH (prop1=val1, prop2=val2)")

View file

@ -666,3 +666,9 @@ FROM (
self.validate_identity("GRANT USAGE ON DATABASE sales_db TO Bob")
self.validate_identity("GRANT USAGE ON SCHEMA sales_schema TO ROLE Analyst_role")
self.validate_identity("GRANT SELECT ON sales_db.sales_schema.tickit_sales_redshift TO Bob")
def test_analyze(self):
self.validate_identity("ANALYZE TBL(col1, col2)")
self.validate_identity("ANALYZE VERBOSE TBL")
self.validate_identity("ANALYZE TBL PREDICATE COLUMNS")
self.validate_identity("ANALYZE TBL ALL COLUMNS")

View file

@ -77,6 +77,7 @@ class TestSnowflake(Validator):
self.validate_identity("SELECT MATCH_CONDITION")
self.validate_identity("SELECT * REPLACE (CAST(col AS TEXT) AS scol) FROM t")
self.validate_identity("1 /* /* */")
self.validate_identity("TO_TIMESTAMP(col, fmt)")
self.validate_identity(
"SELECT * FROM table AT (TIMESTAMP => '2024-07-24') UNPIVOT(a FOR b IN (c)) AS pivot_table"
)
@ -104,7 +105,14 @@ class TestSnowflake(Validator):
        self.validate_identity(
            "SELECT * FROM DATA AS DATA_L ASOF JOIN DATA AS DATA_R MATCH_CONDITION (DATA_L.VAL > DATA_R.VAL) ON DATA_L.ID = DATA_R.ID"
        )
-       self.validate_identity("TO_TIMESTAMP(col, fmt)")
+       self.validate_identity(
+           "WITH t (SELECT 1 AS c) SELECT c FROM t",
+           "WITH t AS (SELECT 1 AS c) SELECT c FROM t",
+       )
+       self.validate_identity(
+           "GET_PATH(json_data, '$id')",
+           """GET_PATH(json_data, '["$id"]')""",
+       )
        self.validate_identity(
            "CAST(x AS GEOGRAPHY)",
            "TO_GEOGRAPHY(x)",
@ -481,6 +489,7 @@ class TestSnowflake(Validator):
            write={
                "": "SELECT LOGICAL_OR(c1), LOGICAL_OR(c2) FROM test",
                "duckdb": "SELECT BOOL_OR(c1), BOOL_OR(c2) FROM test",
+               "oracle": "SELECT MAX(c1), MAX(c2) FROM test",
                "postgres": "SELECT BOOL_OR(c1), BOOL_OR(c2) FROM test",
                "snowflake": "SELECT BOOLOR_AGG(c1), BOOLOR_AGG(c2) FROM test",
                "spark": "SELECT BOOL_OR(c1), BOOL_OR(c2) FROM test",
@ -492,6 +501,7 @@ class TestSnowflake(Validator):
            write={
                "": "SELECT LOGICAL_AND(c1), LOGICAL_AND(c2) FROM test",
                "duckdb": "SELECT BOOL_AND(c1), BOOL_AND(c2) FROM test",
+               "oracle": "SELECT MIN(c1), MIN(c2) FROM test",
                "postgres": "SELECT BOOL_AND(c1), BOOL_AND(c2) FROM test",
                "snowflake": "SELECT BOOLAND_AGG(c1), BOOLAND_AGG(c2) FROM test",
                "spark": "SELECT BOOL_AND(c1), BOOL_AND(c2) FROM test",

View file

@ -263,6 +263,14 @@ TBLPROPERTIES (
self.validate_identity("TRIM(LEADING 'SL' FROM 'SSparkSQLS')")
self.validate_identity("TRIM(TRAILING 'SL' FROM 'SSparkSQLS')")
self.validate_identity("SPLIT(str, pattern, lim)")
self.validate_identity(
"SELECT 1 limit",
"SELECT 1 AS limit",
)
self.validate_identity(
"SELECT 1 offset",
"SELECT 1 AS offset",
)
self.validate_identity(
"SELECT UNIX_TIMESTAMP()",
"SELECT UNIX_TIMESTAMP(CURRENT_TIMESTAMP())",
@ -918,3 +926,15 @@ TBLPROPERTIES (
            with self.subTest(f"Testing STRING() for {dialect}"):
                query = parse_one("STRING(a)", dialect=dialect)
                self.assertEqual(query.sql(dialect), "CAST(a AS STRING)")

    def test_analyze(self):
        self.validate_identity("ANALYZE TABLE tbl COMPUTE STATISTICS NOSCAN")
        self.validate_identity("ANALYZE TABLE tbl COMPUTE STATISTICS FOR ALL COLUMNS")
        self.validate_identity("ANALYZE TABLE tbl COMPUTE STATISTICS FOR COLUMNS foo, bar")
        self.validate_identity("ANALYZE TABLE ctlg.db.tbl COMPUTE STATISTICS NOSCAN")
        self.validate_identity("ANALYZE TABLES COMPUTE STATISTICS NOSCAN")
        self.validate_identity("ANALYZE TABLES FROM db COMPUTE STATISTICS")
        self.validate_identity("ANALYZE TABLES IN db COMPUTE STATISTICS")
        self.validate_identity(
            "ANALYZE TABLE ctlg.db.tbl PARTITION(foo = 'foo', bar = 'bar') COMPUTE STATISTICS NOSCAN"
        )

View file

@ -237,3 +237,7 @@ class TestSQLite(Validator):
        self.validate_identity(
            "CREATE TABLE store (store_id INTEGER PRIMARY KEY AUTOINCREMENT, mgr_id INTEGER NOT NULL UNIQUE REFERENCES staff ON UPDATE CASCADE)"
        )

    def test_analyze(self):
        self.validate_identity("ANALYZE tbl")
        self.validate_identity("ANALYZE schma.tbl")

View file

@ -126,3 +126,24 @@ class TestStarrocks(Validator):
"spark": "SELECT id, t.col FROM tbl LATERAL VIEW EXPLODE(scores) t AS col",
},
)
def test_analyze(self):
self.validate_identity("ANALYZE TABLE TBL(c1, c2) PROPERTIES ('prop1'=val1)")
self.validate_identity("ANALYZE FULL TABLE TBL(c1, c2) PROPERTIES ('prop1'=val1)")
self.validate_identity("ANALYZE SAMPLE TABLE TBL(c1, c2) PROPERTIES ('prop1'=val1)")
self.validate_identity("ANALYZE TABLE TBL(c1, c2) WITH SYNC MODE PROPERTIES ('prop1'=val1)")
self.validate_identity(
"ANALYZE TABLE TBL(c1, c2) WITH ASYNC MODE PROPERTIES ('prop1'=val1)"
)
self.validate_identity(
"ANALYZE TABLE TBL UPDATE HISTOGRAM ON c1, c2 PROPERTIES ('prop1'=val1)"
)
self.validate_identity(
"ANALYZE TABLE TBL UPDATE HISTOGRAM ON c1, c2 WITH 5 BUCKETS PROPERTIES ('prop1'=val1)"
)
self.validate_identity(
"ANALYZE TABLE TBL UPDATE HISTOGRAM ON c1, c2 WITH SYNC MODE WITH 5 BUCKETS PROPERTIES ('prop1'=val1)"
)
self.validate_identity(
"ANALYZE TABLE TBL UPDATE HISTOGRAM ON c1, c2 WITH ASYNC MODE WITH 5 BUCKETS PROPERTIES ('prop1'=val1)"
)

View file

@ -78,3 +78,7 @@ class TestTrino(Validator):
        self.validate_identity(
            "ALTER VIEW people SET AUTHORIZATION alice", check_command_warning=True
        )

    def test_analyze(self):
        self.validate_identity("ANALYZE tbl")
        self.validate_identity("ANALYZE tbl WITH (prop1=val1, prop2=val2)")

View file

@ -883,3 +883,5 @@ GRANT SELECT ON nation TO alice WITH GRANT OPTION
GRANT DELETE ON SCHEMA finance TO bob
SELECT attach
SELECT detach
SELECT 1 OFFSET 1
SELECT 1 LIMIT 1

View file

@ -277,6 +277,10 @@ SELECT x.a AS a FROM x AS x UNION SELECT x.a AS a FROM x AS x UNION SELECT x.a A
SELECT a FROM (SELECT a FROM x UNION SELECT a FROM x) ORDER BY a;
SELECT _q_0.a AS a FROM (SELECT x.a AS a FROM x AS x UNION SELECT x.a AS a FROM x AS x) AS _q_0 ORDER BY a;
# title: nested subqueries in union
((select a from x where a < 1)) UNION ((select a from x where a > 2));
((SELECT x.a AS a FROM x AS x WHERE x.a < 1)) UNION ((SELECT x.a AS a FROM x AS x WHERE x.a > 2));
--------------------------------------
-- Subqueries
--------------------------------------

View file

@ -822,6 +822,22 @@ class TestBuild(unittest.TestCase):
                lambda: exp.union("SELECT 1", "SELECT 2", "SELECT 3", "SELECT 4"),
                "SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4",
            ),
+           (
+               lambda: select("x")
+               .with_("var1", as_=select("x").from_("tbl2").subquery(), scalar=True)
+               .from_("tbl")
+               .where("x > var1"),
+               "WITH (SELECT x FROM tbl2) AS var1 SELECT x FROM tbl WHERE x > var1",
+               "clickhouse",
+           ),
+           (
+               lambda: select("x")
+               .with_("var1", as_=select("x").from_("tbl2"), scalar=True)
+               .from_("tbl")
+               .where("x > var1"),
+               "WITH (SELECT x FROM tbl2) AS var1 SELECT x FROM tbl WHERE x > var1",
+               "clickhouse",
+           ),
        ]:
            with self.subTest(sql):
                self.assertEqual(expression().sql(dialect[0] if dialect else None), sql)
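A sketch of the scalar CTE builder path exercised above (the `scalar` flag is the new piece; ClickHouse renders such a CTE as a scalar WITH expression):

from sqlglot import select

q = (
    select("x")
    .with_("var1", as_=select("x").from_("tbl2"), scalar=True)
    .from_("tbl")
    .where("x > var1")
)
print(q.sql("clickhouse"))
# WITH (SELECT x FROM tbl2) AS var1 SELECT x FROM tbl WHERE x > var1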

View file

@ -880,7 +880,6 @@ FROM tbl1""",
"ALTER TABLE table1 RENAME COLUMN c1 c2",
"ALTER TYPE electronic_mail RENAME TO email",
"ALTER schema doo",
"ANALYZE a.y",
"CALL catalog.system.iceberg_procedure_name(named_arg_1 => 'arg_1', named_arg_2 => 'arg_2')",
"COMMENT ON ACCESS METHOD gin IS 'GIN index access method'",
"CREATE OR REPLACE STAGE",