Joining a large and a small RDD
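// Collect the small RDD to the driver and broadcast it as a lookup map.
// This assumes the small RDD fits in memory on the driver and on every executor.
val smallLookup = sc.broadcast(smallRDD.collect().toMap)

// Map-side join: look up each key of the large RDD in the broadcast map.
// Keys with no match yield None and are dropped by flatMap (inner-join semantics).
largeRDD.flatMap { case (key, value) =>
  smallLookup.value.get(key).map { otherValue =>
    (key, (value, otherValue))
  }
}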
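The snippet above assumes that `sc`, `largeRDD`, and `smallRDD` already exist. As a minimal, self-contained sketch of the same broadcast join, the program below wraps it in a helper and runs it on a local master. The object name `BroadcastJoinSketch`, the helper `broadcastJoin`, the `local[*]` master, and the sample data are all assumptions made for illustration; they do not come from the original snippet.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object BroadcastJoinSketch {
  // Hypothetical helper: inner-joins `large` against a broadcast copy of `small`.
  // Assumes `small` fits in memory on the driver and on each executor.
  def broadcastJoin[K, V, W](
      sc: SparkContext,
      large: RDD[(K, V)],
      small: RDD[(K, W)]): RDD[(K, (V, W))] = {
    // toMap keeps a single value per key, so duplicate keys in `small` collapse.
    val smallLookup = sc.broadcast(small.collect().toMap)
    large.flatMap { case (key, value) =>
      // Option is treated as a zero-or-one element collection by flatMap.
      smallLookup.value.get(key).map { otherValue =>
        (key, (value, otherValue))
      }
    }
  }

  def main(args: Array[String]): Unit = {
    // Local master chosen only so the sketch can run without a cluster.
    val conf = new SparkConf().setAppName("broadcast-join-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val largeRDD = sc.parallelize(Seq(1 -> "a", 2 -> "b", 3 -> "c", 1 -> "d"))
      val smallRDD = sc.parallelize(Seq(1 -> "x", 2 -> "y"))
      broadcastJoin(sc, largeRDD, smallRDD).collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

In this sample, key `3` has no entry in the small RDD, so its record is absent from the result, while both records with key `1` are joined. Because the lookup is built with `toMap`, a small RDD with repeated keys would keep only one value per key; if that matters, grouping the values per key before broadcasting is one option.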