val spark: SparkSession = ...
spark.sessionState.optimizer
Optimizer — Base for Logical Query Plan Optimizers
Optimizer is the base rule-based logical query plan optimizer in Spark SQL that uses Catalyst Framework to optimize logical query plans using optimization rules.
| 
 Note 
 | 
SparkOptimizer is the one and only custom Optimizer.
 | 
Optimizer is available as optimizer of a SessionState.
Optimizer is a RuleExecutor that defines collection of logical plan optimization rules.
| Batch Name | Strategy | Rules | Description | 
|---|---|---|---|
Finish Analysis  | 
  | 
EliminateSubqueryAliases  | 
|
EliminateView  | 
|||
ReplaceExpressions  | 
|||
ReplaceDeduplicateWithAggregate  | 
|||
Union  | 
  | 
CombineUnions  | 
|
Subquery  | 
  | 
OptimizeSubqueries  | 
|
ReplaceIntersectWithSemiJoin  | 
|||
ReplaceExceptWithAntiJoin  | 
|||
ReplaceDistinctWithAggregate  | 
|||
RemoveLiteralFromGroupExpressions  | 
|||
RemoveRepetitionFromGroupExpressions  | 
|||
PushProjectionThroughUnion  | 
|||
EliminateOuterJoin  | 
|||
PushPredicateThroughJoin  | 
|||
InferFiltersFromConstraints  | 
|||
Collapses Repartition and RepartitionByExpression  | 
|||
CollapseProject  | 
|||
CollapseWindow  | 
|||
CombineFilters  | 
|||
CombineLimits  | 
|||
CombineUnions  | 
|||
OptimizeIn  | 
|||
ReorderAssociativeOperator  | 
|||
LikeSimplification  | 
|||
BooleanSimplification  | 
|||
SimplifyConditionals  | 
|||
RemoveDispensableExpressions  | 
|||
SimplifyBinaryComparison  | 
|||
PruneFilters  | 
|||
EliminateSorts  | 
|||
SimplifyCaseConversionExpressions  | 
|||
RewriteCorrelatedScalarSubquery  | 
|||
RemoveRedundantAliases  | 
|||
RemoveRedundantProject  | 
|||
SimplifyCreateStructOps  | 
|||
SimplifyCreateArrayOps  | 
|||
SimplifyCreateMapOps  | 
|||
Check Cartesian Products  | 
  | 
CheckCartesianProducts  | 
|
Join Reorder  | 
  | 
||
ConvertToLocalRelation  | 
|||
OptimizeCodegen  | 
  | 
OptimizeCodegen  | 
|
RewriteSubquery  | 
  | 
RewritePredicateSubquery  | 
|
CollapseProject  | 
| 
 Tip 
 | 
Consult the sources of Optimizer for the up-to-date list of the optimization rules.
 | 
| 
 Note 
 | 
Catalyst is a Spark SQL framework for manipulating trees. It can work with trees of relational operators and expressions in logical plans before they end up as physical execution plans. | 
scala> sql("select 1 + 1 + 1").explain(true)
== Parsed Logical Plan ==
'Project [unresolvedalias(((1 + 1) + 1), None)]
+- OneRowRelation$
== Analyzed Logical Plan ==
((1 + 1) + 1): int
Project [((1 + 1) + 1) AS ((1 + 1) + 1)#4]
+- OneRowRelation$
== Optimized Logical Plan ==
Project [3 AS ((1 + 1) + 1)#4]
+- OneRowRelation$
== Physical Plan ==
*Project [3 AS ((1 + 1) + 1)#4]
+- Scan OneRowRelation[]
| Name | Initial Value | Description | 
|---|---|---|
  | 
Used in Replace Operators, Aggregate, Operator Optimizations, Decimal Optimizations, Typed Filter Optimization and LocalRelation batches (and also indirectly in the User Provided Optimizers rule batch in SparkOptimizer).  | 
Creating Optimizer Instance
Optimizer takes the following when created:
Optimizer initializes the internal properties.