Use TreeReduce/TreeAggregate instead of Reduce/Aggregate
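As TreeReduce and TreeAggregate Demystified explains, treeReduce and treeAggregate are usually more efficient than reduce and aggregate when an RDD has many partitions. They combine partial results on the executors in a multi-level tree before anything reaches the driver, so the driver does not have to merge one result per partition by itself.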
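To make the difference concrete, here is a minimal sketch in plain Python that simulates the two merge strategies over a list of per-partition partial results. It is not Spark's implementation: the helper names `flat_reduce` and `tree_reduce`, the `fan_in` parameter, and the 64-partition layout are assumptions chosen for illustration only.

```python
from functools import reduce
from operator import add


def flat_reduce(partials, op):
    """Merge every partial result in one place, as the driver does for reduce.

    Returns (result, number of partials merged in that final step).
    """
    return reduce(op, partials), len(partials)


def tree_reduce(partials, op, fan_in=2):
    """Merge partials in groups of `fan_in`, level by level (hypothetical helper).

    Returns (result, number of partials merged in the final step,
    number of intermediate levels). Assumes `partials` is non-empty.
    """
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = list(partials)
    levels = 0
    # Keep combining until few enough partials remain for one final merge.
    while len(level) > fan_in:
        level = [reduce(op, level[i:i + fan_in])
                 for i in range(0, len(level), fan_in)]
        levels += 1
    return reduce(op, level), len(level), levels


# 64 simulated partitions of 10 numbers each; each partial is a partition sum.
data = list(range(640))
partials = [sum(data[i:i + 10]) for i in range(0, len(data), 10)]

flat_result, flat_width = flat_reduce(partials, add)
tree_result, tree_width, tree_levels = tree_reduce(partials, add, fan_in=8)

print(flat_result, flat_width)               # same total, 64 partials merged at once
print(tree_result, tree_width, tree_levels)  # same total, 8 partials in the final merge
```

Both strategies produce the same answer, but the final merge handles 8 values instead of 64, at the cost of one extra level of combining. In Spark the intermediate levels run on executors, and the number of levels is controlled by the `depth` argument, which defaults to 2: in PySpark, `rdd.treeReduce(f, depth=2)` and `rdd.treeAggregate(zeroValue, seqOp, combOp, depth=2)`.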