Use TreeReduce/TreeAggregate instead of Reduce/Aggregate
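The treeReduce and treeAggregate functions are often more efficient than reduce and aggregate. Instead of sending every partition's partial result straight to the driver, they merge partial results on the executors over several levels, so the driver has fewer results left to combine.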
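To make the difference concrete, here is a minimal conceptual sketch in plain Python. It does not use Spark and is not how Spark implements these operations. Lists stand in for partitions, and `flat_reduce`, `tree_reduce`, and `fan_in` are hypothetical names chosen for illustration. It only counts how many partial results the final merge step (the driver, in Spark terms) would have to handle in each pattern.

```python
from functools import reduce
from operator import add


def flat_reduce(partitions, f):
    """Model of a plain reduce: all partial results go to one final merge."""
    # One partial result per non-empty partition.
    partials = [reduce(f, p) for p in partitions if p]
    # The final merge receives every partial result at once.
    return reduce(f, partials), len(partials)


def tree_reduce(partitions, f, fan_in=2):
    """Model of a tree reduce: partials are merged in groups, level by level.

    fan_in is a hypothetical knob for this sketch, not a Spark parameter.
    """
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    partials = [reduce(f, p) for p in partitions if p]
    # Keep merging groups of fan_in partials until few enough remain.
    while len(partials) > fan_in:
        partials = [
            reduce(f, partials[i:i + fan_in])
            for i in range(0, len(partials), fan_in)
        ]
    # The final merge only sees what is left after the intermediate levels.
    return reduce(f, partials), len(partials)


# Eight toy "partitions" holding the integers 0..79.
partitions = [list(range(i, i + 10)) for i in range(0, 80, 10)]

flat_total, flat_driver_inputs = flat_reduce(partitions, add)
tree_total, tree_driver_inputs = tree_reduce(partitions, add)

print(flat_total, flat_driver_inputs)
print(tree_total, tree_driver_inputs)
```

Both functions return the same total, but the tree version leaves far fewer partial results for the final step. In Spark itself the corresponding calls are `treeReduce` and `treeAggregate` on an RDD, which take an optional `depth` argument rather than the `fan_in` used in this sketch.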