Manu

Reputation: 43

Drop rows of a Spark DataFrame that contain a specific value in a column, using Scala

I am trying to drop rows of a Spark DataFrame which contain a specific value in a specific column. For example, given the following DataFrame, I'd like to drop all rows which have "two" in column "A", i.e. the rows with index 1 and 2. I want to do this using Scala 2.11 and Spark 2.4.0.

     A      B   C
0    one    0   0
1    two    2   4
2    two    4   8
3    one    6  12
4  three    7  14
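For reference, a DataFrame equivalent to the one above can be created with something like the following (a minimal sketch assuming a local SparkSession; the index column shown in the table is only for illustration and is not an actual column):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("drop-rows-example")
  .getOrCreate()

import spark.implicits._

// Build the example data; there is no explicit row index column in Spark
val df = Seq(
  ("one", 0, 0),
  ("two", 2, 4),
  ("two", 4, 8),
  ("one", 6, 12),
  ("three", 7, 14)
).toDF("A", "B", "C")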

I tried something like this:

df = df.filter(_.A != "two")

or

df = df.filter(df("A") != "two")

However, neither of them worked. Any suggestions on how this can be done?

Upvotes: 1

Views: 3787

Answers (2)

gasparms

Reputation: 3354

Try:

df.filter(not($"A".contains("two")))

Or, if you are looking for an exact match:

df.filter(not($"A".equalTo("two")))
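Both versions assume that the $-column syntax and the not function are in scope. A minimal self-contained sketch (using the example DataFrame df from the question and a SparkSession value named spark) would be:

import org.apache.spark.sql.functions.not
import spark.implicits._  // provides the $"columnName" syntax

// Keep only the rows whose column "A" is not exactly "two"
val result = df.filter(not($"A".equalTo("two")))
result.show()  // the rows with "one" and "three" in column "A" remain

Note that contains does a substring match, so it would also drop rows where "two" is only part of a longer value, while equalTo drops only exact matches.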

Upvotes: 2

Manu

Reputation: 43

I finally found the solution in a very old post: Is there a way to filter a field not containing something in a spark dataframe using scala?

The trick that does it is the following:

df = df.where(!$"A".contains("two"))
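Note that where is just an alias for filter, so this is equivalent to the filter-based answer above. The $"A" syntax requires import spark.implicits._ to be in scope, and reassigning df like this only compiles if df was declared as a var.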

Upvotes: 1
