Reputation: 79
I am new to Spark and am trying to write code in Scala. I have an RDD whose lines have the form:
1: 2 3 5
2: 5 6 7
3: 1 8 9
4: 1 2 4
and another list in the form [1,4,8,9]
I need to filter the RDD so that it keeps only the lines in which either the value before ':' is in the list or any of the values after ':' is in the list.
I have written the following code:
val links = linksFile.filter(t => {
  val l = t.split(": ")
  root.contains(l(0).toInt) ||
  for(x<-l(0).split(" ")){
    root.contains(x.toInt)
  }
})
linksFile is the RDD and root is the list.
But this doesn't work. Any suggestions?
Upvotes: 2
Views: 887
Reputation: 40500
A for-comprehension without a yield doesn't ... well ... yield :)
But you don't really need a for-comprehension (or any "loop" for that matter) here.
Something like this:
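// Split on both ':' and ' ' so every token is a number, then keep the
// original line if any of its numbers is in root.
linksFile.filter(
  _.split("[: ]+").map(_.toInt).exists(root.toSet)
)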
should do it.
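To make the point about yield concrete, here is a minimal sketch on plain Scala collections (no Spark needed). The object name ForYieldDemo and the sample values are made up for illustration; root is the list from the question.

```scala
object ForYieldDemo {
  // The list from the question.
  val root: List[Int] = List(1, 4, 8, 9)
  // Hypothetical sample of the values after ':'.
  val values: Array[String] = "5 8 7".split(" ")

  // Without yield, the for expression desugars to foreach, so its type is Unit
  // and the Boolean computed in the body is thrown away.
  val noYield: Unit = for (x <- values) { root.contains(x.toInt) }

  // With yield, it desugars to map and produces one Boolean per element.
  val withYield: Array[Boolean] = for (x <- values) yield root.contains(x.toInt)

  // exists collapses the per-element checks into the single Boolean a filter needs.
  val anyMatch: Boolean = values.exists(x => root.contains(x.toInt))

  def main(args: Array[String]): Unit = {
    println(withYield.mkString(", "))
    println(anyMatch)
  }
}
```

Because the yield-less version has type Unit, it cannot be the right-hand side of || in the original filter, which is why that code does not compile.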
Upvotes: 0
Reputation: 18424
You're close: the for-loop just doesn't actually use the value computed inside it. You should use the exists method instead. Also I think you want l(1), not l(0), for the second check:
val links = linksFile.filter(t => {
  val l = t.split(": ")
  root.contains(l(0).toInt) ||
  l(1).split(" ").exists { x =>
    root.contains(x.toInt)
  }
})
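If you want to try the predicate without a cluster, the sketch below runs the same logic on the sample data from the question. It assumes a plain List[String] as a stand-in for the RDD (the filter call has the same shape on both), and FilterLinksDemo and keep are illustrative names only.

```scala
object FilterLinksDemo {
  // Sample lines from the question; a List stands in for the RDD[String].
  val linksFile: List[String] =
    List("1: 2 3 5", "2: 5 6 7", "3: 1 8 9", "4: 1 2 4")

  // The lookup list from the question, as a Set for cheap membership checks.
  val root: Set[Int] = Set(1, 4, 8, 9)

  // True if the key before ':' or any value after it is in root.
  def keep(line: String): Boolean = {
    val l = line.split(": ")
    root.contains(l(0).toInt) ||
      l(1).split(" ").exists(x => root.contains(x.toInt))
  }

  val links: List[String] = linksFile.filter(keep)

  def main(args: Array[String]): Unit = links.foreach(println)
}
```

Only the line starting with 2 is dropped, since neither its key nor any of 5, 6, 7 is in the list.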
Upvotes: 3