Hash-partition before transformation over pair RDD
val wordPairsRDD = rdd.map(word => (word, 1)).
partitonBy(new HashPartition(4))
val wordCountsWithReduce = wordPairsRDD
.reduceByKey(_ + _)
.collect()
PreviousUse TreeReduce/TreeAggregate instead of Reduce/AggregateNextUse coalesce to repartition in decrease number of partition
Last updated