sparkSession.sessionState.optimizer
SparkOptimizer — Logical Query Optimizer
SparkOptimizer is the one and only custom logical query plan optimizer in Spark SQL and comes with additional logical plan optimizations.
Note
You can extend the available logical plan optimizations and register your own using ExperimentalMethods.
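As a minimal sketch of registering a custom optimization, the following assumes a local SparkSession and uses the `extraOptimizations` field of `ExperimentalMethods` (reachable as `spark.experimental`). `CountingNoopRule` and `ExtraOptimizationsDemo` are hypothetical names made up for illustration; the rule leaves the plan untouched and only counts its invocations.

```scala
import java.util.concurrent.atomic.AtomicInteger

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule

// Hypothetical rule, for illustration only: returns the plan unchanged
// and records how many times the optimizer applied it.
object CountingNoopRule extends Rule[LogicalPlan] {
  val invocations = new AtomicInteger(0)

  override def apply(plan: LogicalPlan): LogicalPlan = {
    invocations.incrementAndGet()
    plan
  }
}

object ExtraOptimizationsDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("extra-optimizations-demo")
      .getOrCreate()

    // Register the custom rule through ExperimentalMethods.
    spark.experimental.extraOptimizations = Seq(CountingNoopRule)

    // Requesting the optimized plan makes the optimizer run,
    // including the rule registered above.
    val optimized = spark.range(5).filter("id > 2").queryExecution.optimizedPlan
    println(optimized.numberedTreeString)
    println(s"CountingNoopRule invocations: ${CountingNoopRule.invocations.get}")

    spark.stop()
  }
}
```

A rule that really rewrites plans would pattern match inside `apply` (for example with `plan.transform { ... }`) instead of returning the input as-is.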
SparkOptimizer is available as the optimizer attribute of SessionState.
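A short sketch of reaching the optimizer through SessionState and applying it by hand to an analyzed plan. It assumes a Spark version in which `SparkSession.sessionState` is publicly accessible (it is marked unstable), and `SessionOptimizerDemo` is a hypothetical name used only here.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.SparkOptimizer

object SessionOptimizerDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("session-optimizer-demo")
      .getOrCreate()

    // sessionState is an unstable API and may change between releases.
    val optimizer = spark.sessionState.optimizer
    println(s"Is a SparkOptimizer: ${optimizer.isInstanceOf[SparkOptimizer]}")

    // Run the optimizer's batches manually on an analyzed logical plan.
    val analyzed = spark.range(10).filter("id > 2").queryExecution.analyzed
    val optimized = optimizer.execute(analyzed)
    println(optimized.numberedTreeString)

    spark.stop()
  }
}
```

In regular use there is no need to call `execute` directly, since Spark runs the optimizer as part of query execution.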
Note
The result of applying the batches of SparkOptimizer to a logical plan is the optimized logical plan of a structured query, which is available as the optimizedPlan attribute of QueryExecution.
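To make the optimized logical plan concrete, here is a small sketch that prints the analyzed and the optimized plans of the same query side by side. It assumes a local SparkSession; `OptimizedPlanDemo` and the sample query are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession

object OptimizedPlanDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("optimized-plan-demo")
      .getOrCreate()

    // Two stacked filters give the optimizer something to simplify.
    val query = spark.range(10).filter("id > 2").filter("id < 8")
    val qe = query.queryExecution

    println("Analyzed logical plan:")
    println(qe.analyzed.numberedTreeString)

    println("Optimized logical plan:")
    println(qe.optimizedPlan.numberedTreeString)

    spark.stop()
  }
}
```

Calling `query.explain(true)` prints the same plans (together with the parsed and physical ones) without going through `queryExecution` explicitly.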
Batch Name | Strategy | Rules | Description |
---|---|---|---|
Optimize Metadata Only Query | | OptimizeMetadataOnlyQuery | |
Extract Python UDF from Aggregate | | ExtractPythonUDFFromAggregate | |
Prune File Source Table Partitions | | PruneFileSourcePartitions | |
Tip
Enable logging for SparkOptimizer by adding the corresponding logger line to your logging configuration. Refer to Logging.
Creating SparkOptimizer Instance
SparkOptimizer takes the following when created:
Note
SparkOptimizer is created when SessionState is created (which initializes the optimizer property).
Further reading or watching
- (video) Modern Spark DataFrame and Dataset (Intermediate Tutorial) by Adam Breindel from Databricks.