Avoid List of Iterators
Often when reading in a file [22], we want to work with the individual values contained in each line, separated by some delimiter. Splitting a delimited line is a trivial operation using map together with split.
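The splitting step might look something like the following. This is only a minimal sketch that uses plain Scala collections rather than an RDD so that it runs without a Spark context; the sample lines and the names MapSplitSketch, lines, and mapped are hypothetical and not taken from the original listing. An RDD's map applies the function once per line in the same way.

```scala
object MapSplitSketch {
  // Hypothetical sample input: three comma-delimited lines.
  val lines: Seq[String] = Seq("a,b,c", "d,e,f", "g,h,i")

  // map applies split to each line, so every line becomes its own Array[String].
  val mapped: Seq[Array[String]] = lines.map(line => line.split(","))

  def main(args: Array[String]): Unit = {
    // Each element is a whole array of values, not an individual value.
    mapped.foreach(values => println(values.mkString("[", ", ", "]")))
  }
}
```

The result has one nested array per input line, which is the shape the rest of this section sets out to avoid.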
But the issue here is that the returned RDD will effectively be an iterator composed of iterators: each element is itself an array of values. What we want is the individual values obtained after calling the split function. In other words, once the results are collected, we need an Array[String], not an Array[Array[String]]. For this we would use the flatMap function. For those with a functional programming background, flatMap is nothing new, but if you are new to functional programming, it is a great operation to become familiar with.
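To illustrate the flattening behavior, here is a minimal sketch, again using plain Scala collections instead of an RDD and hypothetical names (FlatMapSplitSketch, lines, flattened) and sample data. It assumes that flatMap on an RDD concatenates the per-line results in the same way as flatMap on a Scala collection.

```scala
object FlatMapSplitSketch {
  // Hypothetical sample input: three comma-delimited lines.
  val lines: Seq[String] = Seq("a,b,c", "d,e,f", "g,h,i")

  // flatMap applies split to each line and concatenates the resulting arrays,
  // so the result holds the individual values rather than one array per line.
  val flattened: Seq[String] = lines.flatMap(line => line.split(","))

  def main(args: Array[String]): Unit = {
    println(flattened.size)           // 9 individual values
    println(flattened.mkString(", ")) // a, b, c, d, e, f, g, h, i
  }
}
```

With an RDD, calling collect on the flatMap result would then be expected to yield a single Array[String].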
Running the program shows the difference: the map example returned an Array containing 3 Array[String] instances, while the flatMap call returned the individual values contained in one Array.