RuleExecutor — Tree Transformation Rule Executor

RuleExecutor executes a collection of rules (as batches) to transform a TreeNode.

Note
Available TreeNodes are either logical or physical operators.

RuleExecutor declares the protected batches method, which implementations define to provide the collection of Batch instances to execute.

protected def batches: Seq[Batch]
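
As a rough illustration of that contract, the sketch below uses a simplified stand-in for RuleExecutor rather than Spark's actual class. MiniRuleExecutor, StringCleanup, batchNames and the String-typed "plan" are hypothetical names chosen for the example, and rules are modelled as plain functions.

```scala
// Simplified stand-in for RuleExecutor; NOT Spark's real class.
abstract class MiniRuleExecutor[T] {
  // A named group of rules; each rule is modelled as a plain function here.
  case class Batch(name: String, rules: (T => T)*)

  // Implementations list the batches to run, in order.
  protected def batches: Seq[Batch]

  // Hypothetical helper that exposes batch names so the example can be inspected.
  def batchNames: Seq[String] = batches.map(_.name)
}

// Hypothetical executor whose "plan" is just a String.
object StringCleanup extends MiniRuleExecutor[String] {
  protected def batches: Seq[Batch] = Seq(
    Batch("Trim", _.trim),
    Batch("Normalize", _.toLowerCase, _.replace("  ", " "))
  )
}
```

The point of the design is that the base class owns the execution loop while each concrete executor only declares which batches exist and in what order.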

Applying Rules to Tree — execute Method

execute(plan: TreeType): TreeType

execute iterates over batches and applies rules sequentially to the input plan.

It tracks the number of iterations and the time spent executing each rule (on a plan).

When a rule changes a plan, you should see the following TRACE message in the logs:

TRACE HiveSessionStateBuilder$$anon$1:
=== Applying Rule [ruleName] ===
[currentAndModifiedPlansSideBySide]

Once the number of iterations reaches the maximum defined by the batch’s Strategy, execute stops running the batch and prints the following WARN message to the logs:

WARN HiveSessionStateBuilder$$anon$1: Max iterations ([iteration]) reached for batch [batchName]

When the plan has not changed (after applying the rules), you should see the following TRACE message in the logs, and execute moves on to the rules in the next batch. This moment is called the fixed point (i.e. when the execution converges).

TRACE HiveSessionStateBuilder$$anon$1: Fixed point reached for batch [batchName] after [iteration] iterations.

After the batch finishes, if the plan has been changed by the rules, you should see the following DEBUG message in the logs:

DEBUG HiveSessionStateBuilder$$anon$1:
=== Result of Batch [batchName] ===
[currentAndModifiedPlansSideBySide]

Otherwise, when the rules made no changes to the plan, you should see the following TRACE message in the logs:

TRACE HiveSessionStateBuilder$$anon$1: Batch [batchName] has no effect.
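
The control flow described in this section can be approximated with a toy model. This is a minimal sketch under stated assumptions, not Spark's source: MiniExecutor, its Batch (with a plain maxIterations field standing in for the Strategy) and halveIfEven are hypothetical, plans are compared with ==, and println stands in for logging. It also assumes that no warning is printed for batches limited to a single iteration.

```scala
// Toy model of the execute loop; NOT Spark's actual implementation.
object MiniExecutor {
  // A batch: a name, an iteration limit (standing in for the Strategy), and rules.
  final case class Batch[T](name: String, maxIterations: Int, rules: Seq[T => T])

  def execute[T](plan: T, batches: Seq[Batch[T]]): T =
    batches.foldLeft(plan) { (batchStartPlan, batch) =>
      var curPlan = batchStartPlan
      var lastPlan = curPlan
      var iteration = 1
      var continue = true
      while (continue) {
        // One iteration applies every rule of the batch, in order.
        curPlan = batch.rules.foldLeft(curPlan)((p, rule) => rule(p))
        iteration += 1
        if (iteration > batch.maxIterations) {
          // Assumption: stay quiet for batches that are meant to run only once.
          if (iteration != 2) {
            println(s"WARN Max iterations (${iteration - 1}) reached for batch ${batch.name}")
          }
          continue = false
        }
        if (curPlan == lastPlan) {
          println(s"TRACE Fixed point reached for batch ${batch.name} after ${iteration - 1} iterations.")
          continue = false
        }
        lastPlan = curPlan
      }
      // Report whether the batch as a whole changed the plan.
      if (curPlan != batchStartPlan) println(s"DEBUG Result of Batch ${batch.name}: $curPlan")
      else println(s"TRACE Batch ${batch.name} has no effect.")
      curPlan
    }

  // Hypothetical rule on an Int "plan": halve even numbers, leave odd ones alone.
  val halveIfEven: Int => Int = n => if (n % 2 == 0) n / 2 else n
}
```

With these definitions, running a batch with a limit of 100 on the input 40 halves it down to 5 and reports a fixed point after 4 iterations, whereas a limit of 2 stops at 10 with the max-iterations warning.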

Batch — Collection of Rules

Batch in Catalyst is a named collection of rules with a strategy, e.g.

Batch("Substitution", fixedPoint,
  CTESubstitution,
  WindowsSubstitution,
  EliminateUnions,
  new SubstituteUnresolvedOrdinals(conf)),

A Strategy can be Once or FixedPoint (with a number of iterations).

Note
The Once strategy behaves like a FixedPoint strategy limited to one iteration.
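
The relationship between the two strategies can be sketched as a small type hierarchy. The wrapper object MiniStrategy is hypothetical and the definitions are a simplified model, not a copy of Spark's code.

```scala
// Simplified model of batch strategies; names live in a hypothetical wrapper object.
object MiniStrategy {
  // Every strategy caps the number of times a batch may run.
  sealed trait Strategy { def maxIterations: Int }

  // Run the batch a single time.
  case object Once extends Strategy { val maxIterations = 1 }

  // Re-run the batch until the plan stops changing or the cap is hit.
  final case class FixedPoint(maxIterations: Int) extends Strategy
}
```

Modelled this way, an executor only needs to read maxIterations and does not have to distinguish between the two cases.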

Rule

A rule in Catalyst is a named transformation that can be applied to a plan tree.

The Rule abstract class defines the ruleName attribute and a single method, apply:

apply(plan: TreeType): TreeType

Note
TreeType is the type of a (plan) tree that a Rule works with, e.g. LogicalPlan, SparkPlan or Expression.
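
To make the shape of a rule concrete, here is a minimal sketch with a simplified stand-in for Rule and a made-up expression tree. Expr, Lit, Add and FoldConstants are hypothetical and not part of Catalyst, and deriving ruleName from the class name is an assumption made for the example.

```scala
// Simplified stand-in for Catalyst's Rule; NOT the real class.
object MiniRule {
  abstract class Rule[TreeType] {
    // Assumption: the name is derived from the class name (minus the object suffix "$").
    val ruleName: String = {
      val className = getClass.getName
      if (className.endsWith("$")) className.dropRight(1) else className
    }

    def apply(plan: TreeType): TreeType
  }

  // Hypothetical tiny expression tree.
  sealed trait Expr
  final case class Lit(value: Int) extends Expr
  final case class Add(left: Expr, right: Expr) extends Expr

  // Hypothetical rule: fold additions of literals into a single literal.
  object FoldConstants extends Rule[Expr] {
    def apply(plan: Expr): Expr = plan match {
      case Add(l, r) =>
        // Rewrite the children first, then try to fold this node.
        (apply(l), apply(r)) match {
          case (Lit(a), Lit(b)) => Lit(a + b)
          case (fl, fr)         => Add(fl, fr)
        }
      case other => other
    }
  }
}
```

Because a rule is just a function from a tree to a tree of the same type, an executor can chain rules and compare the input with the output to detect whether anything changed.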
