Data Types

The DataType abstract class is the base type of all built-in data types in Spark SQL, e.g. strings, longs.

import org.apache.spark.sql.types.StringType

scala> StringType.json
res0: String = "string"

scala> StringType.sql
res1: String = STRING

scala> StringType.catalogString
res2: String = string
| Type Family | Data Type | Scala Types |
|---|---|---|
| Atomic Types (except fractional and integral types) | BinaryType | Array[Byte] |
| | BooleanType | Boolean |
| | DateType | java.sql.Date |
| | StringType | String |
| | TimestampType | java.sql.Timestamp |
| Fractional Types | DecimalType | java.math.BigDecimal |
| | DoubleType | Double |
| | FloatType | Float |
| Integral Types | ByteType | Byte |
| | IntegerType | Int |
| | LongType | Long |
| | ShortType | Short |
| | ArrayType | scala.collection.Seq |
| | CalendarIntervalType | |
| | MapType | scala.collection.Map |
| | NullType | |
| | StructType | org.apache.spark.sql.Row |
| | AnyDataType | Matches any concrete data type |
| Caution | FIXME What about AbstractDataType? |
You can extend the type system and create your own user-defined types (UDTs).
The DataType Contract defines methods to build SQL, JSON and string representations.
| Note | DataType (and the concrete Spark SQL types) live in the org.apache.spark.sql.types package. |
You should use the DataTypes object in your code to create complex Spark SQL types, i.e. arrays or maps.
import org.apache.spark.sql.types.DataTypes
import org.apache.spark.sql.types.{BooleanType, LongType, StringType}

scala> val arrayType = DataTypes.createArrayType(BooleanType)
arrayType: org.apache.spark.sql.types.ArrayType = ArrayType(BooleanType,true)

scala> val mapType = DataTypes.createMapType(StringType, LongType)
mapType: org.apache.spark.sql.types.MapType = MapType(StringType,LongType,true)
DataType has support for Scala's pattern matching using the unapply method.
???
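A minimal sketch of pattern matching on DataType values: the simple types are Scala case objects and the complex types (e.g. ArrayType, MapType) are case classes with unapply extractors, so they can be destructured with ordinary match expressions. The describe helper below is hypothetical, not part of Spark's API.

import org.apache.spark.sql.types._

// describe is a hypothetical helper that only illustrates matching on DataType values
def describe(dataType: DataType): String = dataType match {
  case StringType                     => "a string"
  case LongType | IntegerType         => "an integral number"
  case ArrayType(elementType, _)      => s"an array of ${describe(elementType)}"
  case MapType(keyType, valueType, _) => s"a map of ${describe(keyType)} to ${describe(valueType)}"
  case other                          => other.simpleString
}

describe(MapType(StringType, ArrayType(LongType)))
// a map of a string to an array of an integral number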
DataType Contract
Any type in Spark SQL follows the DataType contract, which means that the types define the following methods:
- json and prettyJson to build JSON representations of a data type
- defaultSize to know the default size of values of a type
- simpleString and catalogString to build user-friendly string representations (with the latter for external catalogs)
- sql to build SQL representation
import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}
import org.apache.spark.sql.types.DataTypes.createMapType

val maps = StructType(
  StructField("longs2strings", createMapType(LongType, StringType), false) :: Nil)
scala> maps.prettyJson
res0: String =
{
  "type" : "struct",
  "fields" : [ {
    "name" : "longs2strings",
    "type" : {
      "type" : "map",
      "keyType" : "long",
      "valueType" : "string",
      "valueContainsNull" : true
    },
    "nullable" : false,
    "metadata" : { }
  } ]
}
scala> maps.defaultSize
res1: Int = 2800
scala> maps.simpleString
res2: String = struct<longs2strings:map<bigint,string>>
scala> maps.catalogString
res3: String = struct<longs2strings:map<bigint,string>>
scala> maps.sql
res4: String = STRUCT<`longs2strings`: MAP<BIGINT, STRING>>
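The JSON representation can also be parsed back into an equal DataType using DataType.fromJson. A short sketch that reuses the maps schema from the example above:

import org.apache.spark.sql.types.DataType

// round-trip: parse the JSON representation back into a DataType equal to the original
val restored = DataType.fromJson(maps.json)
assert(restored == maps)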
DataTypes — Factory Methods for Data Types
DataTypes is a Java class with methods to access simple DataType types and to create complex ones in Spark SQL, i.e. arrays and maps.
| Tip | It is recommended to use the DataTypes class to define DataType types in a schema. |
DataTypes lives in the org.apache.spark.sql.types package.
import org.apache.spark.sql.types.DataTypes
import org.apache.spark.sql.types.{BooleanType, LongType, StringType}

scala> val arrayType = DataTypes.createArrayType(BooleanType)
arrayType: org.apache.spark.sql.types.ArrayType = ArrayType(BooleanType,true)

scala> val mapType = DataTypes.createMapType(StringType, LongType)
mapType: org.apache.spark.sql.types.MapType = MapType(StringType,LongType,true)
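DataTypes also exposes the simple types as static fields (e.g. DataTypes.LongType) and offers factory methods for struct fields and decimal types. A minimal sketch (the field names below are made up for illustration):

import java.util.Arrays
import org.apache.spark.sql.types.DataTypes

// a decimal type with precision 10 and scale 2
val decimalType = DataTypes.createDecimalType(10, 2)

// a struct assembled entirely with the DataTypes factory methods
val schema = DataTypes.createStructType(Arrays.asList(
  DataTypes.createStructField("id", DataTypes.LongType, false),
  DataTypes.createStructField("price", decimalType, true)))

schema.simpleString
// struct<id:bigint,price:decimal(10,2)>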
| Note | The simple DataType types are Scala case objects and can be used directly. You may also import the org.apache.spark.sql.types package to have access to all the types. |
UDTs — User-Defined Types
| Caution | FIXME |