Reputation: 5905
I have two RDDs, an RDD[String] and an RDD[(String, String)], with the following contents:
RDD[String]    RDD[(String, String)]
mobile         laptop,aa
smartphone     printer,bb
desktop        scanner,ya
laptop         mobile,gb
printer        burger,gn
I need to intersect these two RDDs and count the keywords they have in common.
The expected output is 3,
because printer, laptop, and mobile are common to both.
I tried intersection()
but could not get it to work. I have done this with arrays, but I don't know how to do it with RDDs (I need to work on RDDs).
Here is what I have tried:
tokenArray.intersect(param._1.split("/")).size > 2)
Please give me a reference or a hint.
Upvotes: 0
Views: 1032
Reputation: 13346
Does the following solve your problem?
val keywords = sc.parallelize(Seq("mobile", "smartphone", "desktop", "laptop", "printer"))
val data = sc.parallelize(Seq(("laptop", "aa"), ("printer", "bb"), ("scanner", "ya"),
("mobile", "gb"), ("burger", "gn")))
val keysInData = data.map(_._1)
val result = keywords.intersection(keysInData).count()
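To sanity-check the expected count without a Spark context, the same logic can be sketched on plain Scala collections. This is only a local illustration, not a replacement for the RDD version: the object name IntersectSketch and the helper commonCount are hypothetical, and sets are used so that duplicates are dropped, which mirrors the fact that RDD.intersection returns distinct elements.

```scala
object IntersectSketch {
  // Hypothetical helper: counts the distinct keywords that also occur
  // as the first element (the key) of a pair in `data`.
  def commonCount(keywords: Seq[String], data: Seq[(String, String)]): Int = {
    val keys = data.map(_._1).toSet   // keep only the keys, deduplicated
    keywords.toSet.intersect(keys).size
  }

  def main(args: Array[String]): Unit = {
    val keywords = Seq("mobile", "smartphone", "desktop", "laptop", "printer")
    val data = Seq(("laptop", "aa"), ("printer", "bb"), ("scanner", "ya"),
      ("mobile", "gb"), ("burger", "gn"))
    // mobile, laptop and printer appear on both sides
    println(commonCount(keywords, data)) // prints 3
  }
}
```

If the same key can appear several times in the pair RDD and every occurrence should be counted, intersection() is not the right tool because it deduplicates; a join on the key would be one alternative to look at.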
Upvotes: 1