import org.apache.spark.sql.types._
val schema = StructType(
StructField("id", LongType, nullable = false) ::
StructField("name", StringType, nullable = false) :: Nil)
import org.apache.spark.sql.catalyst.encoders.RowEncoder
scala> val encoder = RowEncoder(schema)
encoder: org.apache.spark.sql.catalyst.encoders.ExpressionEncoder[org.apache.spark.sql.Row] = class[id[0]: bigint, name[0]: string]
// RowEncoder is never flat
scala> encoder.flat
res0: Boolean = false
RowEncoder — Encoder for DataFrames
RowEncoder is a part of the Encoder framework and acts as the encoder for DataFrames, i.e. Dataset[Row] — Datasets of Rows.
|
Note
|
DataFrame type is a mere type alias for Dataset[Row] that expects a Encoder[Row] available in scope which is indeed RowEncoder itself.
|
RowEncoder is an object in Scala with apply and other factory methods.
RowEncoder can create ExpressionEncoder[Row] from a schema (using apply method).
RowEncoder object belongs to org.apache.spark.sql.catalyst.encoders package.
Creating ExpressionEncoder of Rows — apply method
apply(schema: StructType): ExpressionEncoder[Row]
apply builds ExpressionEncoder of Row, i.e. ExpressionEncoder[Row], from the input StructType (as schema).
Internally, apply creates a BoundReference for the Row type and returns a ExpressionEncoder[Row] for the input schema, a CreateNamedStruct serializer (using serializerFor internal method), a deserializer for the schema, and the Row type.
serializerFor Internal Method
serializerFor(inputObject: Expression, inputType: DataType): Expression
serializerFor creates an Expression that is assumed to be CreateNamedStruct.
serializerFor takes the input inputType and:
-
Returns the input
inputObjectas is for native types, i.e.NullType,BooleanType,ByteType,ShortType,IntegerType,LongType,FloatType,DoubleType,BinaryType,CalendarIntervalType.CautionFIXME What does being native type mean? -
For
UserDefinedTypes, it takes the UDT class from theSQLUserDefinedTypeannotation orUDTRegistrationobject and returns an expression withInvoketo callserializemethod on aNewInstanceof the UDT class. -
For TimestampType, it returns an expression with a StaticInvoke to call
fromJavaTimestamponDateTimeUtilsclass. -
…FIXME
|
Caution
|
FIXME Describe me. |