AbstractSqlParser: Base SQL Parsing Infrastructure
AbstractSqlParser is the one and only ParserInterface in Spark SQL and acts as the foundation of the SQL parsing infrastructure. It has two concrete implementations, which are merely required to define their custom AstBuilder for the final transformation of the textual representation of SQL to the equivalent Spark SQL entities, i.e. DataType, Expression, LogicalPlan, and TableIdentifier.
AbstractSqlParser first sets up SqlBaseLexer and SqlBaseParser for parsing (and passes the latter on to a parsing function) and then uses AstBuilder for the actual parsing.
Name | Description |
---|---|
SparkSqlParser | The default SQL parser, available as sqlParser in SessionState, i.e. spark.sessionState.sqlParser (given val spark: SparkSession = ...). |
CatalystSqlParser | Parses DataType or StructType (schema) from their canonical string representation. |
AbstractSqlParser simply relays all the work of translating a SQL string to that specialized AstBuilder.
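
The two implementations can be tried out side by side. The following is a minimal sketch in spark-shell style, assuming Spark 2.2 or later on the classpath (where sessionState is publicly accessible); the local session settings and the sample strings are made up for the illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
import org.apache.spark.sql.types._

// Assumption: a throwaway local session (in spark-shell this returns the existing one)
val spark = SparkSession.builder().master("local[1]").appName("parser-demo").getOrCreate()

// CatalystSqlParser: data types and schemas from their string representation
val arrayOfInts: DataType = CatalystSqlParser.parseDataType("array<int>")
val schema: StructType = CatalystSqlParser.parseTableSchema("id INT, name STRING")

// The session's default parser: SQL text to an unresolved logical plan
val plan = spark.sessionState.sqlParser.parsePlan("SELECT 1 AS one")

println(arrayOfInts)
println(schema.fieldNames.mkString(", "))
println(plan.numberedTreeString)
```

Neither call touches a catalog or runs a query; the results are plain Catalyst objects that still have to be analyzed before execution.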
AbstractSqlParser Contract
abstract class AbstractSqlParser extends ParserInterface {
  // The only member that concrete parsers are required to define
  def astBuilder: AstBuilder

  // Sets up SqlBaseLexer and SqlBaseParser and applies toResult to the parser
  def parse[T](command: String)(toResult: SqlBaseParser => T): T

  // Parse methods, one per kind of Spark SQL entity
  def parseDataType(sqlText: String): DataType
  def parsePlan(sqlText: String): LogicalPlan
  def parseExpression(sqlText: String): Expression
  def parseTableIdentifier(sqlText: String): TableIdentifier
  def parseTableSchema(sqlText: String): StructType
}
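Method | Description |
---|---|
astBuilder | Gives AstBuilder for the actual SQL parsing. Used in all the parse methods, i.e. parseDataType, parseExpression, parsePlan, parseTableIdentifier, and parseTableSchema. |
parse | Sets up SqlBaseLexer and SqlBaseParser for parsing. Used in all the parse methods. |
parseDataType | Used when… |
parseExpression | Used when… |
parsePlan | Creates a LogicalPlan for a given SQL textual statement. When a SQL statement could not be parsed, parsePlan reports a ParseException. |
parseTableIdentifier | Used when… |
parseTableSchema | Used when… |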
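
To make the parse methods more tangible, here is a short spark-shell style sketch that calls three of them on CatalystSqlParser. It assumes a Spark 2.x or later classpath; the column, table, and database names are invented and need not exist, since parsing alone does not look anything up.

```scala
import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.catalyst.expressions.Expression
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan

// parseExpression: SQL text to an unresolved Catalyst expression
val expr: Expression = CatalystSqlParser.parseExpression("a + 1 > 0")

// parseTableIdentifier: an optionally database-qualified table name
val tid: TableIdentifier = CatalystSqlParser.parseTableIdentifier("db.tbl")

// parsePlan: a SQL query to an unresolved logical plan
val plan: LogicalPlan = CatalystSqlParser.parsePlan("SELECT a FROM t WHERE a > 1")

println(expr)
println(s"database=${tid.database}, table=${tid.table}")
println(plan.treeString)
```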
Setting Up SqlBaseLexer and SqlBaseParser for Parsing: parse Method
parse[T](command: String)(toResult: SqlBaseParser => T): T
parse sets up a proper ANTLR parsing infrastructure with SqlBaseLexer and SqlBaseParser (the ANTLR-specific classes of Spark SQL that are auto-generated at build time from the SqlBase.g4 grammar).
Tip: Review the definition of the ANTLR grammar for Spark SQL in sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4.
Internally, parse first prints out the following INFO message to the logs:
INFO SparkSqlParser: Parsing command: [command]
Tip: Enable the INFO logging level for the custom AbstractSqlParser, i.e. SparkSqlParser or CatalystSqlParser, to see the above INFO message.
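
One way to raise the logging level programmatically is sketched below. It assumes the log4j 1.x API that Spark 2.x bundles; on releases that ship a different logging backend the call may have no effect. The logger names are the fully-qualified class names of the two parsers.

```scala
import org.apache.log4j.{Level, Logger}

// Logger names follow the fully-qualified class names of the two parsers
val parserLoggers = Seq(
  "org.apache.spark.sql.execution.SparkSqlParser",
  "org.apache.spark.sql.catalyst.parser.CatalystSqlParser"
)

// Assumption: log4j 1.x is the active logging backend
parserLoggers.foreach(name => Logger.getLogger(name).setLevel(Level.INFO))
```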
parse then creates and sets up a SqlBaseLexer and a SqlBaseParser, and passes the latter on to the input toResult function, where the parsing finally happens.
Note: parse first uses the SLL prediction mode for parsing and falls back to the LL mode only if that fails.
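
The two-stage strategy from the note can be sketched with the plain ANTLR 4 runtime API. This is an illustration of the general technique, not Spark's own code: parseTwoStage and its parameters are hypothetical names, and the sketch assumes that a failed SLL attempt surfaces as a ParseCancellationException, which depends on how error handling is configured on the parser.

```scala
import org.antlr.v4.runtime.{Parser, TokenStream}
import org.antlr.v4.runtime.atn.PredictionMode
import org.antlr.v4.runtime.misc.ParseCancellationException

// Hypothetical helper: works with any ANTLR-generated parser, e.g. SqlBaseParser
def parseTwoStage[P <: Parser, T](parser: P, tokens: TokenStream)(toResult: P => T): T = {
  // First attempt: SLL prediction is usually faster but can fail on valid input
  parser.getInterpreter.setPredictionMode(PredictionMode.SLL)
  try toResult(parser)
  catch {
    case _: ParseCancellationException =>
      // Second attempt: rewind the token stream, reset the parser, use full LL prediction
      tokens.seek(0)
      parser.reset()
      parser.getInterpreter.setPredictionMode(PredictionMode.LL)
      toResult(parser)
  }
}
```

The curried toResult parameter mirrors the shape of parse: the caller decides which grammar rule to invoke, while the helper owns the prediction-mode handling.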
In case of parsing errors, parse reports a ParseException.
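
A parsing failure can be observed from the caller's side as follows. This is a small spark-shell style sketch assuming a Spark 2.x or later classpath; the misspelled keyword is deliberate, and the exact error text varies between Spark versions, so it is printed rather than matched.

```scala
import org.apache.spark.sql.catalyst.parser.{CatalystSqlParser, ParseException}
import scala.util.{Failure, Success, Try}

// "SELEC" is a deliberate typo so that parsing fails
val attempt = Try(CatalystSqlParser.parsePlan("SELEC 1"))

attempt match {
  case Failure(e: ParseException) => println(s"Parsing failed: ${e.getMessage}")
  case Failure(other)             => throw other
  case Success(plan)              => println(plan.treeString)
}
```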