```scala
import org.apache.spark.sql.SparkSession
val spark: SparkSession = ...

scala> spark.conf
res0: org.apache.spark.sql.RuntimeConfig = org.apache.spark.sql.RuntimeConfig@6b58a0f9
```
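As a quick illustration of the public RuntimeConfig interface above, the following sketch reads, sets and unsets a standard Spark SQL property (assuming an active SparkSession):

```scala
// Reading and writing a Spark SQL property through the public RuntimeConfig
// (spark is an active SparkSession)
spark.conf.set("spark.sql.shuffle.partitions", "4")
assert(spark.conf.get("spark.sql.shuffle.partitions") == "4")

// unset reverts the property to its default value (200 for shuffle partitions)
spark.conf.unset("spark.sql.shuffle.partitions")
```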
SQLConf
SQLConf is an internal key-value configuration store for parameters and hints used in Spark SQL.
SQLConf offers methods to get, set, unset or clear parameter values, as well as accessor methods to read the current value of a parameter or hint.
You can access a session-specific SQLConf using SessionState:
```scala
import org.apache.spark.sql.SparkSession
val spark: SparkSession = ...

import spark.sessionState.conf

// accessing properties through accessor methods
scala> conf.numShufflePartitions
res0: Int = 200

// setting properties using aliases
import org.apache.spark.sql.internal.SQLConf.SHUFFLE_PARTITIONS
conf.setConf(SHUFFLE_PARTITIONS, 2)
scala> conf.numShufflePartitions
res2: Int = 2

// unsetting, i.e. resetting properties to the default value
conf.unsetConf(SHUFFLE_PARTITIONS)
scala> conf.numShufflePartitions
res4: Int = 200
```
Name | Parameter / Hint | Description
---|---|---
 | | Used exclusively for …
 | | Used exclusively in JoinSelection execution planning strategy
 | | Used exclusively in BroadcastExchangeExec (for broadcasting a table to executors).
 | | Used…FIXME
 | | Used exclusively in pivot operator.
 | | Used exclusively in RelationalGroupedDataset when creating the result
 | | Used…FIXME
 | | Used in: …
 | | Used exclusively in CostBasedJoinReorder logical plan optimization
 | | Used exclusively when a physical operator is requested the first n rows as an array.
 | | Used exclusively in JoinSelection execution planning strategy to prefer sort merge join over shuffle hash join.
 | | Used exclusively in ReorderJoin logical plan optimization (and indirectly in …)
 | | Used…FIXME
 | | Used in: …
 | | Used exclusively when …
 | | Used in: …
 | | Used exclusively when …
 | | Used exclusively in …
Name | Default Value | Description
---|---|---
 | | Enables adaptive query execution. Use adaptiveExecutionEnabled method to access the current value.
 | | Maximum size (in bytes) for a table that will be broadcast to all worker nodes when performing a join. If the size of the statistics of the logical plan of a table is at most the setting, the DataFrame is broadcast for join. Negative values or … Use autoBroadcastJoinThreshold method to access the current value.
 | | Timeout in seconds for the broadcast wait time in broadcast joins. When negative, it is assumed infinite (i.e. …). Used through SQLConf.broadcastTimeout.
 | | Enables cost-based optimization (CBO) for estimation of plan statistics when enabled (i.e. …). Used (through …)
 | | Enables join reorder for cost-based optimization (CBO). Use joinReorderEnabled method to access the current value.
 | | Enables join reordering based on star schema detection for cost-based optimization (CBO) in ReorderJoin logical plan optimization. Use starSchemaDetection method to access the current value.
 | | (internal) Whether whole-stage codegen can be temporarily disabled for the part of a query that has failed to compile generated code (…). Use wholeStageFallback method to access the current value.
 | | (internal) Maximum number of output fields (including nested fields) that whole-stage codegen supports. Going above the number deactivates whole-stage codegen. Use wholeStageMaxNumFields method to access the current value.
 | | (internal) Whether the whole stage (of multiple physical operators) will be compiled into a single Java method (…). Use wholeStageEnabled method to access the current value.
 | Java's … | (internal) Table size used in query planning. It is by default set to Java's … Use defaultSizeInBytes method to access the current value.
 | | Flag to enable ObjectHashAggregateExec in Aggregation execution planning strategy. Use useObjectHashAggregation method to access the current value.
 | | (internal) Controls…FIXME Use columnBatchSize method to access the current value.
 | | (internal) Controls…FIXME Use useCompression method to access the current value.
 | | (internal) Controls JoinSelection execution planning strategy to prefer sort merge join over shuffle hash join. Use preferSortMergeJoin method to access the current value.
 | | (internal) Minimal increase rate in the number of partitions between attempts when executing … Use limitScaleUpFactor method to access the current value.
 | | Maximum number of (distinct) values that will be collected without error (when doing a pivot without specifying the values for the pivot column). Use dataFramePivotMaxValues method to access the current value.
 | | Controls whether to retain columns used for aggregation or not (in RelationalGroupedDataset operators). Use dataFrameRetainGroupColumns method to access the current value.
 | | Controls whether to resolve ambiguity in join conditions for self-joins automatically.
 | | Default number of partitions to use when shuffling data for joins or aggregations. Corresponds to Apache Hive's mapred.reduce.tasks property that Spark considers deprecated. Use numShufflePartitions method to access the current value.
 | | Controls whether to delete the expired log files in file stream sink.
 | | (internal) Threshold for the number of rows buffered in the window operator. Use windowExecBufferSpillThreshold method to access the current value.
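Several of the parameters above can also be adjusted at runtime through the public spark.conf interface. A minimal sketch, assuming an active SparkSession (the keys are standard Spark SQL properties):

```scala
// disable automatic broadcast joins (a negative threshold turns them off)
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")

// tune the number of shuffle partitions used for joins and aggregations
spark.conf.set("spark.sql.shuffle.partitions", "8")
```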
Note: SQLConf is a private[sql] serializable class in the org.apache.spark.sql.internal package.
Getting Parameters and Hints
You can get the current parameters and hints using the following family of get methods.

```scala
getConfString(key: String): String
getConf[T](entry: ConfigEntry[T], defaultValue: T): T
getConf[T](entry: ConfigEntry[T]): T
getConf[T](entry: OptionalConfigEntry[T]): Option[T]
getConfString(key: String, defaultValue: String): String
getAllConfs: immutable.Map[String, String]
getAllDefinedConfs: Seq[(String, String, String)]
```
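A short sketch of the get family in a Spark shell, assuming an active SparkSession (the keys used are standard Spark SQL properties):

```scala
import spark.sessionState.conf

// read a parameter by its string key
conf.getConfString("spark.sql.shuffle.partitions")

// read with a fallback value for keys that are not set
conf.getConfString("spark.sql.shuffle.partitions", "200")

// read through a typed ConfigEntry alias
import org.apache.spark.sql.internal.SQLConf.SHUFFLE_PARTITIONS
conf.getConf(SHUFFLE_PARTITIONS)

// snapshot of all explicitly-set parameters
conf.getAllConfs
```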
Setting Parameters and Hints
You can set parameters and hints using the following family of set methods.

```scala
setConf(props: Properties): Unit
setConfString(key: String, value: String): Unit
setConf[T](entry: ConfigEntry[T], value: T): Unit
```
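The set family can be sketched under the same assumptions (an active SparkSession; the property key is a standard Spark SQL setting):

```scala
import spark.sessionState.conf

// set by plain string key and value
conf.setConfString("spark.sql.shuffle.partitions", "8")

// set through a typed ConfigEntry alias (the value is type-checked)
import org.apache.spark.sql.internal.SQLConf.SHUFFLE_PARTITIONS
conf.setConf(SHUFFLE_PARTITIONS, 8)

// bulk-set from java.util.Properties
import java.util.Properties
val props = new Properties()
props.setProperty("spark.sql.shuffle.partitions", "8")
conf.setConf(props)
```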