Dialects
While there is a SQL standard, most SQL engines support a variation of that standard. This makes it difficult to write portable SQL code. SQLGlot bridges all the different variations, called "dialects", with an extensible SQL transpilation framework.
The base `sqlglot.dialects.dialect.Dialect` class implements a generic dialect that aims to be as universal as possible.
Each SQL variation has its own `Dialect` subclass, extending the corresponding `Tokenizer`, `Parser` and `Generator` classes as needed.
Implementing a custom Dialect
Creating a new SQL dialect may seem complicated at first, but it is actually quite simple in SQLGlot:
```python
from sqlglot import exp
from sqlglot.dialects.dialect import Dialect
from sqlglot.generator import Generator
from sqlglot.tokens import Tokenizer, TokenType


class Custom(Dialect):
    class Tokenizer(Tokenizer):
        QUOTES = ["'", '"']  # Strings can be delimited by either single or double quotes
        IDENTIFIERS = ["`"]  # Identifiers can be delimited by backticks

        # Associates certain meaningful words with tokens that capture their intent
        KEYWORDS = {
            **Tokenizer.KEYWORDS,
            "INT64": TokenType.BIGINT,
            "FLOAT64": TokenType.DOUBLE,
        }

    class Generator(Generator):
        # Specifies how AST nodes, i.e. subclasses of exp.Expression, should be converted into SQL
        TRANSFORMS = {
            exp.Array: lambda self, e: f"[{self.expressions(e)}]",
        }

        # Specifies how AST nodes representing data types should be converted into SQL
        TYPE_MAPPING = {
            exp.DataType.Type.TINYINT: "INT64",
            exp.DataType.Type.SMALLINT: "INT64",
            exp.DataType.Type.INT: "INT64",
            exp.DataType.Type.BIGINT: "INT64",
            exp.DataType.Type.DECIMAL: "NUMERIC",
            exp.DataType.Type.FLOAT: "FLOAT64",
            exp.DataType.Type.DOUBLE: "FLOAT64",
            exp.DataType.Type.BOOLEAN: "BOOL",
            exp.DataType.Type.TEXT: "STRING",
        }
```
The above example demonstrates how certain parts of the base `Dialect` class can be overridden to match a different specification. Even though it is a fairly realistic starting point, we strongly encourage the reader to study existing dialect implementations in order to understand how their various components can be modified, depending on the use-case.
```python
from sqlglot.dialects.bigquery import BigQuery
from sqlglot.dialects.clickhouse import ClickHouse
from sqlglot.dialects.databricks import Databricks
from sqlglot.dialects.dialect import Dialect, Dialects
from sqlglot.dialects.doris import Doris
from sqlglot.dialects.drill import Drill
from sqlglot.dialects.duckdb import DuckDB
from sqlglot.dialects.hive import Hive
from sqlglot.dialects.mysql import MySQL
from sqlglot.dialects.oracle import Oracle
from sqlglot.dialects.postgres import Postgres
from sqlglot.dialects.presto import Presto
from sqlglot.dialects.redshift import Redshift
from sqlglot.dialects.snowflake import Snowflake
from sqlglot.dialects.spark import Spark
from sqlglot.dialects.spark2 import Spark2
from sqlglot.dialects.sqlite import SQLite
from sqlglot.dialects.starrocks import StarRocks
from sqlglot.dialects.tableau import Tableau
from sqlglot.dialects.teradata import Teradata
from sqlglot.dialects.trino import Trino
from sqlglot.dialects.tsql import TSQL
```