Use TreeReduce/TreeAggregate instead of Reduce/Aggregate
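As you'll see in "TreeReduce and TreeAggregate Demystified", the treeReduce/treeAggregate functions are more efficient than reduce/aggregate: they merge partial results on the executors in a multi-level tree, so the driver receives only a few pre-combined values instead of one result per partition.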
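To make the difference concrete, here is a minimal pure-Python sketch that only simulates the two strategies; it does not use Spark. The helper names `flat_reduce` and `tree_reduce` are hypothetical, and the grouping loop is loosely modelled on how Spark shrinks the number of partitions per level, so treat it as an illustration rather than Spark's actual implementation. In Spark itself the corresponding calls are `rdd.treeReduce(f, depth)` and `rdd.treeAggregate(zeroValue)(seqOp, combOp, depth)`, where `depth` defaults to 2.

```python
import math
from functools import reduce
from operator import add


def flat_reduce(partitions, op):
    """Simulate reduce: every partition's partial result goes to the driver."""
    partials = [reduce(op, p) for p in partitions if p]
    # The driver merges len(partials) values by itself.
    return reduce(op, partials), len(partials)


def tree_reduce(partitions, op, depth=2):
    """Simulate treeReduce: partials are pre-merged in levels before the driver.

    Assumption: `op` is associative and commutative, because regrouping
    changes the order in which values are combined.
    """
    partials = [reduce(op, p) for p in partitions if p]
    n = len(partials)
    # Fan-in per level, chosen so roughly `depth` levels are needed.
    scale = max(int(math.ceil(n ** (1.0 / depth))), 2)
    # Add a level only while it meaningfully cuts the driver's workload.
    while n > scale + math.ceil(n / scale):
        n //= scale
        groups = [[] for _ in range(n)]
        for i, value in enumerate(partials):
            groups[i % n].append(value)
        # Each group stands for one executor-side merge task.
        partials = [reduce(op, g) for g in groups]
    return reduce(op, partials), len(partials)


# Hypothetical data: 640 numbers spread over 64 partitions.
data = list(range(640))
partitions = [data[i::64] for i in range(64)]

flat_total, flat_driver_inputs = flat_reduce(partitions, add)
tree_total, tree_driver_inputs = tree_reduce(partitions, add, depth=2)
print(flat_total, flat_driver_inputs)
print(tree_total, tree_driver_inputs)
```

Both strategies return the same total, but in this simulation the driver merges 64 partial results with the flat approach and only 8 with the tree approach. The saving matters most when there are many partitions or when each partial result is large, such as a gradient vector or a big map.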