ExplainCommand Logical Command

ExplainCommand is a logical command with side effects that lets users see how a structured query is planned and will eventually be executed, i.e. it shows the logical and physical plans, with or without details about codegen and cost.

When executed, ExplainCommand computes a QueryExecution that is then used to output a single-column DataFrame with one of the following:

codegen explain, i.e. WholeStageCodegen subtrees, if the codegen flag is enabled.
extended explain, i.e. the parsed, analyzed, and optimized logical plans together with the physical plan, if the extended flag is enabled.
cost explain, i.e. the optimized logical plan with stats, if the cost flag is enabled.
simple explain, i.e. the physical plan only, when none of the codegen, extended, and cost flags is enabled.
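The choice among the four outputs can be pictured as a simple flag check. The following is a minimal sketch in plain Scala, not Spark code: the names ExplainOutputSketch, ExplainOutput and selectOutput are hypothetical, and the precedence (codegen first, then extended, then cost, otherwise simple) is an assumption based on the order of the list above.

```scala
// Hypothetical names for illustration only; none of these are Spark classes.
object ExplainOutputSketch {
  sealed trait ExplainOutput
  case object CodegenExplain extends ExplainOutput
  case object ExtendedExplain extends ExplainOutput
  case object CostExplain extends ExplainOutput
  case object SimpleExplain extends ExplainOutput

  // Assumed precedence: codegen, then extended, then cost, otherwise simple.
  // All flags are disabled by default, mirroring the flags described in the text.
  def selectOutput(
      extended: Boolean = false,
      codegen: Boolean = false,
      cost: Boolean = false): ExplainOutput =
    if (codegen) CodegenExplain
    else if (extended) ExtendedExplain
    else if (cost) CostExplain
    else SimpleExplain
}
```

With no flag enabled, ExplainOutputSketch.selectOutput() falls through to SimpleExplain, which corresponds to the physical-plan-only output.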

ExplainCommand is created by Dataset’s explain operator and by the EXPLAIN SQL statement (which accepts the EXTENDED and CODEGEN options).
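As a quick illustration of the Dataset side, the sketch below calls the explain operator with and without the extended flag. It assumes a local SparkSession; the application name and the sample query are arbitrary choices made for this example.

```scala
import org.apache.spark.sql.SparkSession

// In spark-shell a session already exists; getOrCreate simply returns it.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("explain-demo") // arbitrary name, used only in this sketch
  .getOrCreate()

// An arbitrary sample query
val query = spark.range(5).filter("id > 2")

// Simple explain: prints the physical plan only
query.explain()

// Extended explain: prints the logical plans followed by the physical plan
query.explain(extended = true)
```

Both calls print to standard output and return Unit, so the plan text is not available as a value this way.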

// Explain in SQL

scala> sql("EXPLAIN EXTENDED show tables").show(truncate = false)
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|plan                                                                                                                                                                                                                                           |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|== Parsed Logical Plan ==
ShowTablesCommand

== Analyzed Logical Plan ==
tableName: string, isTemporary: boolean
ShowTablesCommand

== Optimized Logical Plan ==
ShowTablesCommand

== Physical Plan ==
ExecutedCommand
   +- ShowTablesCommand|
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

The following EXPLAIN variants in SQL queries are not supported:

EXPLAIN FORMATTED
EXPLAIN LOGICAL

scala> sql("EXPLAIN LOGICAL show tables")
org.apache.spark.sql.catalyst.parser.ParseException:
Operation not allowed: EXPLAIN LOGICAL(line 1, pos 0)

== SQL ==
EXPLAIN LOGICAL show tables
^^^
...

`codegenString` Attribute

Caution

FIXME

`output` Attribute

Caution

FIXME

Creating ExplainCommand Instance

ExplainCommand takes the following when created:

LogicalPlan
extended flag that controls whether to include extended details in the output when ExplainCommand is executed (disabled by default)
codegen flag that controls whether to include codegen details in the output when ExplainCommand is executed (disabled by default)
cost flag that controls whether to include cost details (statistics) in the output when ExplainCommand is executed (disabled by default)

ExplainCommand initializes the output attribute.
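For experimentation, an ExplainCommand can also be created by hand. This is a sketch that assumes a Spark version (e.g. 2.3 or 2.4) in which the case class takes a LogicalPlan plus the three boolean flags described above; other versions may have a different constructor. The local session and the sample query are assumptions of this example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.command.ExplainCommand

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("explain-command-demo") // arbitrary name, used only in this sketch
  .getOrCreate()

// The unresolved logical plan of an arbitrary sample query
val logicalPlan = spark.range(1).queryExecution.logical

// Assumption: all three flags default to false (the "disabled by default" above)
val simpleCmd = ExplainCommand(logicalPlan)

// Enable only the extended flag, leaving codegen and cost disabled
val extendedCmd = ExplainCommand(logicalPlan, extended = true)
```

Since the flags are plain constructor parameters, they can be read back from the instance, e.g. extendedCmd.extended.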

Note	`ExplainCommand` is created when…FIXME

Computing Text Representation of QueryExecution (as Single Row) — `run` Method

run(sparkSession: SparkSession): Seq[Row]

run computes QueryExecution and returns its text representation in a single Row.

Note	`run` is part of the RunnableCommand contract to execute commands.

Internally, run creates an IncrementalExecution directly for a streaming Dataset, or otherwise requests SessionState to execute the LogicalPlan.

Note	Streaming Dataset is a part of Spark Structured Streaming.

run then requests QueryExecution to build the output text representation, i.e. codegened, extended (with logical and physical plans), with stats, or simple.

In the end, run creates a Row with the text representation.
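The single-row result can be inspected by calling run directly. This sketch assumes a Spark version (e.g. 2.3 or 2.4) in which ExplainCommand takes a LogicalPlan and boolean flags; the local session and the sample query are assumptions made for this example.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.execution.command.ExplainCommand

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("explain-run-demo") // arbitrary name, used only in this sketch
  .getOrCreate()

// Extended explain of an arbitrary (non-streaming) sample query
val cmd = ExplainCommand(spark.range(1).queryExecution.logical, extended = true)

// run returns the text representation wrapped in a single Row
val rows: Seq[Row] = cmd.run(spark)

// The only column of that row holds the whole plan text
val planText: String = rows.head.getString(0)
println(planText)
```

Calling run by hand is mainly useful for exploration; in regular code the same text is reachable through Dataset’s explain operator or the EXPLAIN SQL statement.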
