```scala
import org.apache.spark.sql.SparkSession

val spark: SparkSession = SparkSession.builder
  .appName("My Spark Application") // optional and will be autogenerated if not specified
  .master("local[*]")              // avoid hardcoding the deployment environment
  .enableHiveSupport()             // self-explanatory, isn't it?
  .getOrCreate
```
Builder — Building SparkSession using Fluent API

Builder is the fluent API to build a fully-configured SparkSession.
Method | Description |
---|---|
getOrCreate | Gets the current SparkSession or creates a new one. |
enableHiveSupport | Enables Hive support. |
You can use the fluent design pattern to set the various properties of a SparkSession, which opens a session to Spark SQL.
Note: You can have multiple SparkSessions in a single Spark application for different data catalogs (through relational entities).
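As a quick illustration of the note above, here is a minimal sketch (assuming a local master) that uses newSession to create a second SparkSession sharing the same underlying SparkContext:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.master("local[*]").getOrCreate()
val another = spark.newSession() // separate SQL conf, temporary views and UDFs

assert(spark ne another)                           // two distinct sessions
assert(spark.sparkContext eq another.sparkContext) // one shared SparkContext
```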
getOrCreate Method

Caution: FIXME
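Until this section is fleshed out, a minimal sketch of the expected behaviour (assuming a local master): getOrCreate returns the already-active SparkSession on subsequent calls rather than building a new one.

```scala
import org.apache.spark.sql.SparkSession

val spark1 = SparkSession.builder.master("local[*]").getOrCreate()
val spark2 = SparkSession.builder.getOrCreate() // reuses the active session

assert(spark1 eq spark2)
```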
config Method

Caution: FIXME
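Until this section is fleshed out, a minimal sketch (assuming a local master and using spark.sql.shuffle.partitions as an example setting) of how config registers a property on the builder before the session is created:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .master("local[*]")
  .config("spark.sql.shuffle.partitions", "4") // applied when the session is created
  .getOrCreate()

assert(spark.conf.get("spark.sql.shuffle.partitions") == "4")
```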
Enabling Hive Support — enableHiveSupport Method

When creating a SparkSession, you can optionally enable Hive support using the enableHiveSupport method.
```scala
enableHiveSupport(): Builder
```
enableHiveSupport enables Hive support (with connectivity to a persistent Hive metastore, support for Hive serdes, and Hive user-defined functions).
Note: You do not need any existing Hive installation to use Spark’s Hive support. Refer to SharedState.
Internally, enableHiveSupport makes sure that the Hive classes are on the CLASSPATH (i.e. that Spark SQL can load org.apache.hadoop.hive.conf.HiveConf) and sets the spark.sql.catalogImplementation property to hive.
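A minimal sketch of the above (assuming spark-hive and its dependencies are on the classpath, e.g. a Spark build with Hive support):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .master("local[*]")
  .enableHiveSupport() // fails fast if org.apache.hadoop.hive.conf.HiveConf cannot be loaded
  .getOrCreate()

// enableHiveSupport sets spark.sql.catalogImplementation to hive
println(spark.sparkContext.getConf.get("spark.sql.catalogImplementation", "in-memory"))
```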