Debugging Query Execution
The debug package object contains tools for debugging query execution that you can use to do the full analysis of your structured queries (i.e. Datasets).
Note: To be clear, they are methods, my dear.
The methods are in the org.apache.spark.sql.execution.debug package and work on your Datasets and SparkSession.
Caution: FIXME Expand on the SparkSession part.
Import the package and do the full analysis using the debug or debugCodegen methods.
debug Method
import org.apache.spark.sql.execution.debug._
scala> spark.range(10).where('id === 4).debug
Results returned: 1
== WholeStageCodegen ==
Tuples output: 1
id LongType: {java.lang.Long}
== Filter (id#25L = 4) ==
Tuples output: 0
id LongType: {}
== Range (0, 10, splits=8) ==
Tuples output: 0
id LongType: {}
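Outside the Spark shell, the same call can be made from a standalone application. The following is a minimal sketch, assuming a local master and a hypothetical object name (DebugSketch); it uses col("id") instead of the 'id symbol syntax so that it does not depend on the shell's implicit imports.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
// Brings the implicit debug() and debugCodegen() methods into scope for Datasets
import org.apache.spark.sql.execution.debug._

// DebugSketch is a hypothetical name used only for this illustration
object DebugSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local master is good enough for experimenting
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("debug-sketch")
      .getOrCreate()
    try {
      val query = spark.range(10).where(col("id") === 4)
      // Executes the query and prints the result count followed by
      // per-operator tuple counts and column types to standard output
      query.debug()
    } finally {
      spark.stop()
    }
  }
}
```

Note that debug runs the query in order to collect the statistics, so on large inputs it costs as much as executing the query itself.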
"Debugging" Codegen: debugCodegen Method
Use the debugCodegen method to review the CodegenSupport-generated code.
import org.apache.spark.sql.execution.debug._
scala> spark.range(10).where('id === 4).debugCodegen
Found 1 WholeStageCodegen subtrees.
== Subtree 1 / 1 ==
*Filter (id#29L = 4)
+- *Range (0, 10, splits=8)
Generated code:
/* 001 */ public Object generate(Object[] references) {
/* 002 */ return new GeneratedIterator(references);
/* 003 */ }
/* 004 */
/* 005 */ final class GeneratedIterator extends org.apache.spark.sql.execution.BufferedRowIterator {
/* 006 */ private Object[] references;
...
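When the generated code should be saved or searched rather than printed, the same package object also offers codegenString, which returns the text as a String for a given physical plan. The sketch below assumes that function is available in your Spark version (the package is internal and may change between releases); the object name DebugCodegenSketch and the generatedCode helper are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.execution.debug._

// DebugCodegenSketch and generatedCode are hypothetical names for this illustration
object DebugCodegenSketch {
  // Returns what debugCodegen would print, as a String
  def generatedCode(spark: SparkSession): String = {
    val query = spark.range(10).where(col("id") === 4)
    // executedPlan is the physical plan after whole-stage codegen has been applied
    codegenString(query.queryExecution.executedPlan)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master is good enough for experimenting
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("debug-codegen-sketch")
      .getOrCreate()
    try {
      val code = generatedCode(spark)
      // Print only the beginning; the full listing can be long
      println(code.linesIterator.take(10).mkString("\n"))
    } finally {
      spark.stop()
    }
  }
}
```

Unlike debug, this does not execute the query; it only plans it and renders the generated source.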