import org.apache.spark.sql.types._
val schema = StructType(
  StructField("id", LongType, nullable = false) ::
  StructField("name", StringType, nullable = false) :: Nil)
import org.apache.spark.sql.catalyst.encoders.RowEncoder
scala> val encoder = RowEncoder(schema)
encoder: org.apache.spark.sql.catalyst.encoders.ExpressionEncoder[org.apache.spark.sql.Row] = class[id[0]: bigint, name[0]: string]
// RowEncoder is never flat
scala> encoder.flat
res0: Boolean = false
RowEncoder — Encoder for DataFrames
RowEncoder is a part of the Encoder framework and acts as the encoder for DataFrames, i.e. Dataset[Row] (Datasets of Rows).
Note: The DataFrame type is a mere type alias for Dataset[Row], which expects an Encoder[Row] to be available in scope. That encoder is RowEncoder itself.
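To make the relationship between Dataset[Row] and RowEncoder concrete, here is a minimal sketch that passes an encoder built by RowEncoder explicitly to Dataset.map. It assumes a Spark version where RowEncoder(schema) is available (as in the REPL session at the top of this page); the local SparkSession, the app name, and the generated name values are illustrative assumptions only.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.catalyst.encoders.RowEncoder
import org.apache.spark.sql.types._

// Illustrative local session (an assumption, not part of the original text)
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("row-encoder-demo")
  .getOrCreate()

val schema = StructType(
  StructField("id", LongType, nullable = false) ::
  StructField("name", StringType, nullable = false) :: Nil)

// spark.range gives a Dataset[java.lang.Long]; mapping it to Row
// requires an Encoder[Row], supplied here explicitly by RowEncoder
val rows = spark.range(3)
  .map(n => Row(n.longValue, s"name-$n"))(RowEncoder(schema))

// Dataset[Row] and DataFrame are the same type, so this assignment compiles
val df: DataFrame = rows
df.show()
```

Passing the encoder in the second parameter list avoids relying on an implicit Encoder[Row], which spark.implicits does not provide for arbitrary schemas.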
RowEncoder is an object in Scala with apply and other factory methods.
RowEncoder can create ExpressionEncoder[Row] from a schema (using the apply method).
The RowEncoder object belongs to the org.apache.spark.sql.catalyst.encoders package.
Creating ExpressionEncoder of Rows — apply Method
apply(schema: StructType): ExpressionEncoder[Row]
apply builds an ExpressionEncoder of Row, i.e. ExpressionEncoder[Row], from the input StructType (as schema).
Internally, apply creates a BoundReference for the Row type and returns an ExpressionEncoder[Row] for the input schema, a CreateNamedStruct serializer (using the serializerFor internal method), a deserializer for the schema, and the Row type.
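The pieces that apply assembles can be inspected on the returned encoder. The following is a sketch only, assuming a Spark version in which ExpressionEncoder exposes schema, serializer, deserializer and clsTag (as Spark 2.x does); the exact expression trees printed vary between versions, so no output is shown.

```scala
import org.apache.spark.sql.catalyst.encoders.RowEncoder
import org.apache.spark.sql.types._

val schema = StructType(
  StructField("id", LongType, nullable = false) ::
  StructField("name", StringType, nullable = false) :: Nil)

val encoder = RowEncoder(schema)

// The schema the encoder was created for
println(encoder.schema.treeString)

// The serializer expressions built by apply (one per top-level field here)
encoder.serializer.foreach(e => println(e.numberedTreeString))

// The deserializer expression that turns an internal row back into a Row
println(encoder.deserializer.numberedTreeString)

// The runtime class the encoder is created for (the Row type)
println(encoder.clsTag.runtimeClass)
```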
serializerFor Internal Method
serializerFor(inputObject: Expression, inputType: DataType): Expression
serializerFor creates an Expression that is assumed to be CreateNamedStruct.
serializerFor takes the input inputType and:
- Returns the input inputObject as is for native types, i.e. NullType, BooleanType, ByteType, ShortType, IntegerType, LongType, FloatType, DoubleType, BinaryType, CalendarIntervalType.
  Caution: FIXME What does being native type mean?
- For UserDefinedTypes, it takes the UDT class from the SQLUserDefinedType annotation or UDTRegistration object and returns an expression with Invoke to call the serialize method on a NewInstance of the UDT class.
- For TimestampType, it returns an expression with a StaticInvoke to call fromJavaTimestamp on the DateTimeUtils class.
- …FIXME
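Since serializerFor is internal, it cannot be called directly from user code. One way to observe the cases listed above is to print the serializer expressions of an encoder whose schema mixes a native type with TimestampType. This is a hedged sketch: the schema and the names tsSchema and tsEncoder are hypothetical, and the printed trees depend on the Spark version, so no output is shown.

```scala
import org.apache.spark.sql.catalyst.encoders.RowEncoder
import org.apache.spark.sql.types._

// Hypothetical schema: one native type (LongType) and one TimestampType field
val tsSchema = StructType(
  StructField("id", LongType, nullable = false) ::
  StructField("ts", TimestampType, nullable = true) :: Nil)

val tsEncoder = RowEncoder(tsSchema)

// Print each field's serializer expression tree; for the ts field,
// look for a StaticInvoke node that refers to fromJavaTimestamp
tsEncoder.serializer.foreach(e => println(e.numberedTreeString))
```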
Caution: FIXME Describe me.