HashPartitioner

HashPartitioner is a Partitioner that distributes keys across a configurable number of partitions (the partitions constructor argument) when data is shuffled.

Table 1. HashPartitioner Attributes and Methods
Property Description

numPartitions

The number of partitions, exactly as given by partitions.

getPartition

0 for null keys. For non-null keys, the key's hashCode (Java’s Object.hashCode) modulo partitions. A negative remainder is shifted up by partitions, so the result is always between 0 and partitions - 1.

equals

true if the other object is a HashPartitioner with the same number of partitions. Otherwise, false.

hashCode

The number of partitions, i.e. the same value as numPartitions.
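
The behavior in Table 1 can be sketched without a Spark dependency. The class below is a minimal illustration, not Spark's source: the name SketchHashPartitioner and the helper nonNegativeMod are hypothetical, and the sketch assumes a strictly positive number of partitions so that the modulo is always defined.

```scala
// Illustrative stand-in for HashPartitioner (hypothetical name, no Spark dependency).
final class SketchHashPartitioner(partitions: Int) {
  // Assumption of this sketch: at least one partition, so `%` never divides by zero.
  require(partitions > 0, s"Number of partitions ($partitions) must be positive.")

  def numPartitions: Int = partitions

  // Remainder that always falls in [0, mod), even when x is negative.
  private def nonNegativeMod(x: Int, mod: Int): Int = {
    val rawMod = x % mod
    rawMod + (if (rawMod < 0) mod else 0)
  }

  // null keys go to partition 0; every other key is placed by its hashCode.
  def getPartition(key: Any): Int = key match {
    case null => 0
    case _    => nonNegativeMod(key.hashCode, numPartitions)
  }

  // Two partitioners are equal when they split data into the same number of partitions.
  override def equals(other: Any): Boolean = other match {
    case h: SketchHashPartitioner => h.numPartitions == numPartitions
    case _                        => false
  }

  override def hashCode: Int = numPartitions
}
```

With four partitions, the key 7 lands in partition 3, while the key -7 lands in partition 1: the raw remainder -3 is shifted up by 4 rather than being clamped to 0.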

Note
HashPartitioner is the default Partitioner for the coalesce transformation with shuffle enabled, which is what calling repartition does.

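It is possible to re-shuffle data even when all the records for a key k1 are already on a single Spark executor (its BlockManager, to be precise). The target partition depends only on the key's hash and the number of partitions, not on where the records currently live. When HashPartitioner's result for k1 is 3, the records for k1 go to partition 3 (the fourth partition, since partition indices start at 0) and therefore to whichever executor runs the task for that partition.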
