Reputation: 350
I want to process each row of an RDD of comma-separated values. What I am trying to achieve is to replace every value close to zero (between -0.01 and 0.01) with an actual zero. Here is what I did:
val newRDD = oldRDD
  .map(line => line.split(","))
  .map(line => for (value <- line) {
    if (value.toDouble >= -0.01 && value.toDouble <= 0.01)
      0.toString()
    else
      value
  })
All I am getting is just empty parentheses () for all rows. Am I making some stupid mistake?
Thanks.
Upvotes: 1
Views: 317
Reputation: 16096
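You should add the yield keyword so that the for loop returns a collection of values. Without yield, a for loop runs only for its side effects and evaluates to Unit, which is printed as ():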
.map(line => for (value <- line) yield {
  if (value.toDouble >= -0.01 && value.toDouble <= 0.01)
    "0"
  else
    value
})
You can read it as: for every value in the line collection, yield (that is, return) the result of the if/else expression.
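To see the difference outside Spark, here is a minimal sketch on a plain Array[String]. The names YieldDemo and zeroSmall and the sample line are made up for illustration and do not come from the question. It also shows that a for ... yield over a single collection behaves like a call to map, which is what you would apply to each row of the RDD.

```scala
object YieldDemo {
  // Hypothetical helper: replaces every field within +/- threshold of zero
  // with the string "0" and leaves all other fields unchanged.
  def zeroSmall(fields: Array[String], threshold: Double = 0.01): Array[String] =
    for (value <- fields) yield {
      if (math.abs(value.toDouble) <= threshold) "0" else value
    }

  def main(args: Array[String]): Unit = {
    val fields = "0.005,1.5,-0.002,-3".split(",")

    // Without yield the loop body's result is discarded and the
    // whole expression has type Unit.
    val noYield = for (value <- fields) { value.toDouble }
    println(noYield) // prints ()

    // With yield the loop builds a new Array[String].
    val cleaned = zeroSmall(fields)
    println(cleaned.mkString(",")) // prints 0,1.5,0,-3

    // The same transformation written with map.
    val viaMap = fields.map(v => if (math.abs(v.toDouble) <= 0.01) "0" else v)
    println(viaMap.sameElements(cleaned)) // prints true
  }
}
```

In the RDD pipeline this would correspond to something like .map(line => YieldDemo.zeroSmall(line.split(","))), assuming every field parses as a Double.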
You can also use the DataFrame API to load a comma-separated file.
Upvotes: 3