```scala
import org.apache.spark.sql.SparkSession

val spark: SparkSession = SparkSession.builder
  .master("local[*]")
  .appName("My Spark Application")
  .config("spark.sql.warehouse.dir", "c:/Temp") // (1)
  .getOrCreate
```
(1) Sets spark.sql.warehouse.dir for the Spark SQL session

Settings
The following is a list of the settings used to configure Spark SQL applications. You can set them in a SparkSession upon instantiation using the config method.
Name | Default | Description
---|---|---
spark.sql.catalogImplementation | in-memory | (internal) Selects the active catalog implementation from: in-memory, hive
spark.sql.sources.default | parquet | Defines the default data source to use for DataFrameReader
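For settings that are runtime-configurable (such as spark.sql.sources.default), a minimal sketch of reading and overriding the value on an existing session, assuming the spark session created above:

```scala
// Inspect the current default data source, then change it for this session only.
println(spark.conf.get("spark.sql.sources.default"))  // "parquet" unless overridden
spark.conf.set("spark.sql.sources.default", "json")   // spark.read.load(...) now expects JSON
```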
spark.sql.warehouse.dir
spark.sql.warehouse.dir (default: ${system:user.dir}/spark-warehouse) is the default location of the Hive warehouse directory (using Derby) with managed databases and tables.
See also the official Hive Metastore Administration document.
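A rough sketch of the effect on managed tables, assuming the spark session configured with c:/Temp above (the table name is made up for illustration):

```scala
// Saving a Dataset as a managed table places its data under the warehouse directory,
// i.e. under c:/Temp/demo_numbers with the configuration shown at the top of this page.
spark.range(5).write.saveAsTable("demo_numbers")
println(spark.conf.get("spark.sql.warehouse.dir"))  // c:/Temp (possibly as a fully-qualified URI)
```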
spark.sql.parquet.filterPushdown
spark.sql.parquet.filterPushdown (default: true) is a flag that controls the filter predicate push-down optimization for data sources using the Parquet file format.
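A sketch of how the flag changes a Parquet scan; the file path and column name below are hypothetical:

```scala
import org.apache.spark.sql.functions.col

// With push-down disabled, the predicate is applied only after rows are read.
spark.conf.set("spark.sql.parquet.filterPushdown", "false")
spark.read.parquet("/tmp/people.parquet").where(col("age") > 21).explain()

// With push-down enabled (the default), the Parquet scan node lists the predicate
// under PushedFilters, so matching row groups can be skipped at the source.
spark.conf.set("spark.sql.parquet.filterPushdown", "true")
spark.read.parquet("/tmp/people.parquet").where(col("age") > 21).explain()
```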
spark.sql.allowMultipleContexts
spark.sql.allowMultipleContexts (default: true) controls whether creating multiple SQLContexts/HiveContexts is allowed.
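A minimal sketch of disabling it when the session is created (purely illustrative):

```scala
import org.apache.spark.sql.SparkSession

// Disallow additional SQLContexts/HiveContexts in this application.
val spark: SparkSession = SparkSession.builder
  .master("local[*]")
  .appName("Single-context application")
  .config("spark.sql.allowMultipleContexts", "false")
  .getOrCreate
```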
spark.sql.columnNameOfCorruptRecord
spark.sql.columnNameOfCorruptRecord
…FIXME
spark.sql.dialect
spark.sql.dialect
- FIXME
spark.sql.streaming.checkpointLocation
spark.sql.streaming.checkpointLocation is the default location for storing checkpoint data for continuously executing queries.
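A sketch of relying on the default checkpoint location for a streaming query; the socket source, host/port, and checkpoint directory below are illustrative assumptions:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .master("local[*]")
  .appName("Checkpointed streaming query")
  .config("spark.sql.streaming.checkpointLocation", "/tmp/spark-checkpoints")
  .getOrCreate

// No per-query checkpointLocation option is given below, so Spark uses a
// subdirectory under /tmp/spark-checkpoints for this query.
val lines = spark.readStream
  .format("socket")
  .option("host", "localhost")
  .option("port", "9999")
  .load()

val query = lines.writeStream
  .format("console")
  .start()

query.awaitTermination()
```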