Arslan Ali

Reputation: 1

Inequality test of two columns from the same DataFrame in PySpark

In Scala Spark, we can keep the rows where column A is not equal to column B of the same DataFrame with df.filter(col("A") =!= col("B")). How can we do the same in PySpark?

I have tried different options, such as df.filter(~(df["A"] == df["B"])) and the != operator, but got errors.

Upvotes: 0

Views: 1689

Answers (1)

Bartosz Gajda

Reputation: 1167

Take a look at this snippet:

df = spark.createDataFrame([(1, 2), (1, 1)], "id: int, val: int")
df.show()
+---+---+
| id|val|
+---+---+
|  1|  2|
|  1|  1|
+---+---+

from pyspark.sql.functions import col

df.filter(col("id") != col("val")).show()
+---+---+
| id|val|
+---+---+
|  1|  2|
+---+---+


Upvotes: 2
