Chuck

Reputation: 1293

How to filter an RDD map in Scala by elements not in a tuple

I have a word count example. If I want to filter out one common word, I can do it like this, where wordList is a tuple:

val filterWords = wordList.filter(x => x != "to")

But it's more useful to create a list of words to filter on:

val filterWords = ("a", "to", "the", "of", "I", "you")

How do I use that in the filter above? In other words, how can I do the equivalent of this SQL?

where wordList not in ("a", "to", "the", "of", "I", "you")

Upvotes: 0

Views: 98

Answers (2)

Levi Ramsey

Reputation: 20551

val filterWords = Set("a", "to", "the", "of", "I", "you")

wordList.filterNot(filterWords.contains(_))

filterWords.contains will return true if and only if the element of wordList under consideration is in filterWords. filterNot will pass through the elements for which the contains call returns false.
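
To make that concrete, here is a minimal sketch on a plain Scala List. The sample sentence and the object name StopWordFilter are made up for illustration; only filterWords comes from the answer above. It also shows the equivalent filter call with a negated predicate.

```scala
object StopWordFilter {
  // Hypothetical sample data standing in for the asker's wordList.
  val wordList: List[String] = List("I", "want", "to", "go", "to", "the", "park")

  // Words to drop; a Set keeps membership checks cheap.
  val filterWords: Set[String] = Set("a", "to", "the", "of", "I", "you")

  // Keep only the words that are NOT in filterWords.
  val viaFilterNot: List[String] = wordList.filterNot(filterWords.contains)

  // Same result using filter with a negated predicate.
  val viaFilter: List[String] = wordList.filter(w => !filterWords.contains(w))

  def main(args: Array[String]): Unit = {
    println(viaFilterNot) // List(want, go, park)
    println(viaFilter)    // List(want, go, park)
  }
}
```

If wordList is a Spark RDD[String] rather than a local collection, the filter form with the negated predicate is the one I would expect to carry over, since RDD exposes filter; as far as I know it does not offer filterNot.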

Upvotes: 2

Saskia

Reputation: 1056

What you created is a tuple, not a list. Use List(...) instead:

val filterWords = List("a", "to", "the", "of", "I", "you")

Then you can use

wordList.filter(x => !filterWords.contains(x))

Also have a look at the full API of List.
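
On the tuple-versus-list point, a small sketch of the difference, using illustrative names (TupleVsList, asTuple, asList, fromTuple) that are not from the question: bare parentheses build a tuple, which has no contains method, while List(...) builds a collection that does.

```scala
object TupleVsList {
  // Parentheses alone build a Tuple3, which has no contains method.
  val asTuple: (String, String, String) = ("a", "to", "the")

  // List(...) builds a collection that supports contains.
  val asList: List[String] = List("a", "to", "the")

  // A tuple can be turned into a collection via productIterator,
  // but the elements come back typed as Any.
  val fromTuple: List[Any] = asTuple.productIterator.toList

  def main(args: Array[String]): Unit = {
    println(asList.contains("to"))    // true
    println(fromTuple.contains("to")) // true
  }
}
```

Declaring the words as a List or Set from the start avoids the conversion and keeps the element type as String.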

Upvotes: 0
