import org.apache.spark.sql.SparkSession

val spark: SparkSession = SparkSession.builder()
  .appName("My Spark Application") // optional and will be autogenerated if not specified
  .master("local[*]")              // avoid hardcoding the deployment environment
  .enableHiveSupport()             // self-explanatory, isn't it?
  .getOrCreate()
Builder — Building SparkSession using Fluent API
Builder is the fluent API to build a fully-configured SparkSession.
| Method | Description |
|---|---|
| getOrCreate | Gets the current SparkSession or creates a new one. |
| enableHiveSupport | Enables Hive support. |
You can use the fluent design pattern to set the various properties of a SparkSession that opens a session to Spark SQL.
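The chaining described above can be sketched as follows. This is a minimal illustration, assuming Spark SQL is on the classpath and a local master is acceptable; the object name, application name, and property value are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object FluentBuilderExample {
  // Every builder method returns the Builder itself, so the calls chain.
  def buildSession(): SparkSession =
    SparkSession.builder()
      .appName("fluent-builder-demo")               // hypothetical application name
      .master("local[2]")                           // local mode, for illustration only
      .config("spark.sql.shuffle.partitions", "4")  // an arbitrary Spark SQL property
      .getOrCreate()

  def main(args: Array[String]): Unit = {
    val spark = buildSession()
    // Properties set on the builder can be read back from the runtime configuration.
    println(spark.conf.get("spark.sql.shuffle.partitions"))
    println(spark.sparkContext.appName)
    spark.stop()
  }
}
```

Reading the property back through spark.conf is one simple way to confirm that the builder settings reached the session.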
Note: You can have multiple SparkSessions in a single Spark application for different data catalogs (through relational entities).
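One way to see several sessions side by side is SparkSession.newSession, which creates a second session on the same SparkContext. The sketch below assumes a local master and uses a temporary view as the relational entity; the object, application, and view names are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object MultipleSessionsExample {
  // True if the session's catalog lists a table or temporary view with this name.
  def hasTable(session: SparkSession, name: String): Boolean =
    session.catalog.listTables().collect().exists(_.name == name)

  def main(args: Array[String]): Unit = {
    val first = SparkSession.builder()
      .appName("multiple-sessions-demo") // hypothetical application name
      .master("local[2]")                // local mode, for illustration only
      .getOrCreate()

    // A second session that shares the SparkContext but has its own
    // temporary views and SQL configuration.
    val second = first.newSession()

    // Temporary views are scoped to the session that registered them.
    first.range(3).createOrReplaceTempView("numbers")

    println(s"same SparkContext:      ${first.sparkContext eq second.sparkContext}")
    println(s"view visible in first:  ${hasTable(first, "numbers")}")
    println(s"view visible in second: ${hasTable(second, "numbers")}")

    first.stop()
  }
}
```

The view registered in the first session should not appear in the second one, even though both run on the same SparkContext.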
getOrCreate Method
Caution: FIXME
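Until this section is filled in, the following sketch shows the "get or create" behaviour as observed in Spark 2.x and 3.x: a second builder call in the same thread returns the session that already exists. It assumes a local master and no session created earlier in the JVM; the object and application names are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object GetOrCreateExample {
  def main(args: Array[String]): Unit = {
    // No session exists yet, so this call creates one.
    val first = SparkSession.builder()
      .appName("get-or-create-demo") // hypothetical application name
      .master("local[2]")            // local mode, for illustration only
      .getOrCreate()

    // No master or appName here: the builder finds the existing session.
    val second = SparkSession.builder().getOrCreate()

    println(s"same instance: ${first eq second}")

    first.stop()
  }
}
```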
config Method
Caution: FIXME
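Until this section is filled in, here is a small sketch of the config overloads available on the Builder: a String, Long, or Boolean value per key, or a whole SparkConf. It assumes a local master; the object name, application name, and the chosen properties are only illustrative.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object ConfigExample {
  def buildSession(): SparkSession = {
    // Settings collected in a SparkConf can be passed to the builder in one call.
    val conf = new SparkConf().set("spark.ui.enabled", "false")

    SparkSession.builder()
      .appName("config-demo")                       // hypothetical application name
      .master("local[2]")                           // local mode, for illustration only
      .config(conf)                                 // config(conf: SparkConf)
      .config("spark.sql.shuffle.partitions", 4L)   // config(key: String, value: Long)
      .config("spark.sql.crossJoin.enabled", true)  // config(key: String, value: Boolean)
      .config("spark.sql.session.timeZone", "UTC")  // config(key: String, value: String)
      .getOrCreate()
  }

  def main(args: Array[String]): Unit = {
    val spark = buildSession()
    // Values come back as strings regardless of the overload used to set them.
    println(spark.conf.get("spark.sql.shuffle.partitions"))
    println(spark.conf.get("spark.sql.session.timeZone"))
    spark.stop()
  }
}
```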
Enabling Hive Support — enableHiveSupport Method
When creating a SparkSession, you can optionally enable Hive support using the enableHiveSupport method.
enableHiveSupport(): Builder
enableHiveSupport enables Hive support (with connectivity to a persistent Hive metastore, support for Hive serdes, and Hive user-defined functions).
Note: You do not need any existing Hive installation to use Spark’s Hive support. Refer to SharedState.
Internally, enableHiveSupport makes sure that the Hive classes are on the CLASSPATH (e.g. org.apache.hadoop.hive.conf.HiveConf) and sets the spark.sql.catalogImplementation property to hive. If the Hive classes cannot be found, enableHiveSupport throws an IllegalArgumentException.
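The effect on spark.sql.catalogImplementation can be checked with a sketch like the one below. It assumes a local master and that no other session already exists in the JVM; whether the Hive branch succeeds depends on the spark-hive module being on the classpath. The object and application names are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import scala.util.{Failure, Success, Try}

object HiveSupportExample {
  // enableHiveSupport throws if the Hive classes are missing,
  // so the whole builder chain is wrapped in Try.
  def tryHiveSession(): Try[SparkSession] = Try {
    SparkSession.builder()
      .appName("hive-support-demo") // hypothetical application name
      .master("local[2]")           // local mode, for illustration only
      .enableHiveSupport()
      .getOrCreate()
  }

  def main(args: Array[String]): Unit =
    tryHiveSession() match {
      case Success(spark) =>
        // With Hive support enabled, the property is expected to read "hive".
        println(spark.conf.get("spark.sql.catalogImplementation"))
        spark.stop()
      case Failure(e) =>
        // Most likely the Hive classes are not on the classpath.
        println(s"Hive support unavailable: ${e.getMessage}")
    }
}
```

Wrapping the call in Try keeps the example usable both with and without the Hive classes, which is handy when the same code runs in different build profiles.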