SparkEnv.get.serializerManager
SerializerManager
Caution: FIXME
When SparkEnv is created (either for the driver or executors), it instantiates a SerializerManager, which is then used to create a BlockManager.
SerializerManager automatically selects the "best" serializer for shuffle blocks: KryoSerializer when an RDD’s types are known to be compatible with Kryo, or the default Serializer otherwise.
The common idiom in Spark’s code is to access the current SerializerManager using SparkEnv, i.e. SparkEnv.get.serializerManager.
Note: SerializerManager was introduced in SPARK-13926.
Creating SerializerManager Instance
Caution: FIXME
wrapStream Method
Caution: FIXME
dataDeserializeStream Method
Caution: FIXME
Automatic Selection of Best Serializer
Caution: FIXME
SerializerManager will automatically pick a Kryo serializer for ShuffledRDDs whose key, value, and/or combiner types are primitives, arrays of primitives, or strings.
Selecting "Best" Serializer: getSerializer Method
getSerializer(keyClassTag: ClassTag[_], valueClassTag: ClassTag[_]): Serializer
getSerializer selects the "best" Serializer given the input types for keys and values (in an RDD).
getSerializer returns KryoSerializer when the types of both keys and values are compatible with Kryo, and the default Serializer otherwise.
Note: The default Serializer is defined when SerializerManager is created.
Note: getSerializer is used when ShuffledRDD returns the single-element dependency list (with a ShuffleDependency).
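The selection rule described above can be sketched in plain Scala without a Spark dependency. This is a simplified model, not Spark's actual source: the Serializer, KryoSerializer and DefaultSerializer types below are hypothetical stand-ins, and the sketch assumes that "compatible with Kryo" means primitives, arrays of primitives, or strings, for both the key and the value type.

```scala
import scala.reflect.{ClassTag, classTag}

// Hypothetical stand-ins for Spark's serializer classes (not the real API).
sealed trait Serializer
case object KryoSerializer extends Serializer
case object DefaultSerializer extends Serializer

object SerializerSelection {
  // ClassTags of the JVM primitive types.
  private val primitiveClassTags: Set[ClassTag[_]] = Set[ClassTag[_]](
    ClassTag.Boolean, ClassTag.Byte, ClassTag.Char, ClassTag.Double,
    ClassTag.Float, ClassTag.Int, ClassTag.Long, ClassTag.Short)

  // Primitives plus arrays of primitives (wrap turns a tag for T into a tag for Array[T]).
  private val kryoFriendlyClassTags: Set[ClassTag[_]] =
    primitiveClassTags ++ primitiveClassTags.map(_.wrap)

  // Assumption: a type is Kryo-compatible if it is a primitive,
  // an array of primitives, or a String.
  def canUseKryo(ct: ClassTag[_]): Boolean =
    kryoFriendlyClassTags.contains(ct) || ct == classTag[String]

  // Pick the Kryo stand-in only when both key and value types qualify;
  // otherwise fall back to the default stand-in.
  def getSerializer(keyClassTag: ClassTag[_], valueClassTag: ClassTag[_]): Serializer =
    if (canUseKryo(keyClassTag) && canUseKryo(valueClassTag)) KryoSerializer
    else DefaultSerializer
}
```

Under these assumptions, SerializerSelection.getSerializer(classTag[Int], classTag[String]) yields the Kryo stand-in, while a value type such as List[Int] falls back to the default one.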
Settings
| Name | Default value | Description |
|---|---|---|
| | | The flag to control whether to compress shuffle output when stored. |
| | | The flag to control whether to compress RDD partitions when stored serialized. |
| | | The flag to control whether to compress shuffle output temporarily spilled to disk. |
| | | |
| | | The flag to enable IO encryption. |
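As an illustration of how such flags are typically set, the following sketch configures them on a SparkConf. The property names do not appear in the table above; the sketch assumes they are the standard Spark properties spark.shuffle.compress, spark.rdd.compress, spark.shuffle.spill.compress and spark.io.encryption.enabled, and the values shown are what these properties are understood to default to. It requires spark-core on the classpath.

```scala
import org.apache.spark.SparkConf

object SerializerManagerSettings {
  // Assumption: these property names match the descriptions in the Settings
  // table; the table itself does not list them.
  val conf: SparkConf = new SparkConf(loadDefaults = false)
    .set("spark.shuffle.compress", "true")        // compress shuffle output when stored
    .set("spark.rdd.compress", "false")           // compress serialized RDD partitions
    .set("spark.shuffle.spill.compress", "true")  // compress shuffle data spilled to disk
    .set("spark.io.encryption.enabled", "false")  // enable IO encryption
}
```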