"""
## Dialects

While there is a SQL standard, most SQL engines support a variation of that standard. This makes it difficult
to write portable SQL code. SQLGlot bridges all the different variations, called "dialects", with an extensible
SQL transpilation framework.

The base `sqlglot.dialects.dialect.Dialect` class implements a generic dialect that aims to be as universal as possible.

Each SQL variation has its own `Dialect` subclass, extending the corresponding `Tokenizer`, `Parser` and `Generator`
classes as needed.

### Implementing a custom Dialect

Creating a new SQL dialect may seem complicated at first, but it is actually quite simple in SQLGlot:

```python
from sqlglot import exp
from sqlglot.dialects.dialect import Dialect
from sqlglot.generator import Generator
from sqlglot.tokens import Tokenizer, TokenType


class Custom(Dialect):
    class Tokenizer(Tokenizer):
        QUOTES = ["'", '"']  # Strings can be delimited by either single or double quotes
        IDENTIFIERS = ["`"]  # Identifiers can be delimited by backticks

        # Associates certain meaningful words with tokens that capture their intent
        KEYWORDS = {
            **Tokenizer.KEYWORDS,
            "INT64": TokenType.BIGINT,
            "FLOAT64": TokenType.DOUBLE,
        }

    class Generator(Generator):
        # Specifies how AST nodes, i.e. subclasses of exp.Expression, should be converted into SQL
        TRANSFORMS = {
            exp.Array: lambda self, e: f"[{self.expressions(e)}]",
        }

        # Specifies how AST nodes representing data types should be converted into SQL
        TYPE_MAPPING = {
            exp.DataType.Type.TINYINT: "INT64",
            exp.DataType.Type.SMALLINT: "INT64",
            exp.DataType.Type.INT: "INT64",
            exp.DataType.Type.BIGINT: "INT64",
            exp.DataType.Type.DECIMAL: "NUMERIC",
            exp.DataType.Type.FLOAT: "FLOAT64",
            exp.DataType.Type.DOUBLE: "FLOAT64",
            exp.DataType.Type.BOOLEAN: "BOOL",
            exp.DataType.Type.TEXT: "STRING",
        }
```

The above example demonstrates how certain parts of the base `Dialect` class can be overridden to match a different
specification. Even though it is a fairly realistic starting point, we strongly encourage the reader to study existing
dialect implementations in order to understand how their various components can be modified, depending on the use-case.

----
"""

from sqlglot.dialects.bigquery import BigQuery
from sqlglot.dialects.clickhouse import ClickHouse
from sqlglot.dialects.databricks import Databricks
from sqlglot.dialects.dialect import Dialect, Dialects
from sqlglot.dialects.doris import Doris
from sqlglot.dialects.drill import Drill
from sqlglot.dialects.duckdb import DuckDB
from sqlglot.dialects.hive import Hive
from sqlglot.dialects.mysql import MySQL
from sqlglot.dialects.oracle import Oracle
from sqlglot.dialects.postgres import Postgres
from sqlglot.dialects.presto import Presto
from sqlglot.dialects.redshift import Redshift
from sqlglot.dialects.snowflake import Snowflake
from sqlglot.dialects.spark import Spark
from sqlglot.dialects.spark2 import Spark2
from sqlglot.dialects.sqlite import SQLite
from sqlglot.dialects.starrocks import StarRocks
from sqlglot.dialects.tableau import Tableau
from sqlglot.dialects.teradata import Teradata
from sqlglot.dialects.trino import Trino
from sqlglot.dialects.tsql import TSQL