Reputation: 1
in scala spark we can filter if column A value is not equal to column B or same dataframe as
df.filter(col("A")=!=col("B"))
How we can do this same in Pyspark ?
I have tried differemt options like
df.filter(~(df["A"] == df["B"]))
and !=
operator but got errors
Upvotes: 0
Views: 1689
Reputation: 1167
Take a look at this snippet:
df = spark.createDataFrame([(1, 2), (1, 1)], "id: int, val: int")
df.show()
+---+---+
| id|val|
+---+---+
| 1| 2|
| 1| 1|
+---+---+
from pyspark.sql.functions import col
df.filter(col("id") != col("val")).show()
+---+---+
| id|val|
+---+---+
| 1| 2|
+---+---+
Upvotes: 2