Reputation: 821
# DataframeA and DataframeB match:
DataframeA:
col: Name "Ali", "Bilal", "Ahsan"
DataframeB:
col: Name "Ali", "Bilal", "Ahsan"
# DataframeC and DataframeD DO NOT match:
DataframeC:
col: Name "Ali", "Ahsan", "Bilal"
DataframeD:
col: Name "Ali", "Bilal", "Ahsan"
I want to compare the column values position by position, so the order of the rows matters. Any help would be greatly appreciated.
Upvotes: 0
Views: 103
Reputation: 142
Use set for comparison in Python.
DataframeC.columns
-> ["Ali", "Ahsan", "Bilal"]
DataframeD.columns
-> ["Ali", "Bilal", "Ahsan"]
DataframeC.columns == DataframeD.columns
-> False
set(DataframeC.columns) == set(DataframeD.columns)
-> True
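Keep in mind that set() ignores order, so it returns True for DataframeC and DataframeD even though their rows are arranged differently. Also, in PySpark .columns returns the column names (here just Name), not the values, so the values have to be fetched first. A minimal sketch of both checks, assuming the dataframes are small enough to collect to the driver; the helper names column_values, same_order and same_members are hypothetical, and without an explicit sort Spark does not strictly guarantee the order in which rows come back:

```python
def column_values(df, name):
    """Return the values of one column as a list, in the order the rows are returned."""
    # select() + collect() brings the rows to the driver; each Row can be indexed like a tuple.
    return [row[0] for row in df.select(name).collect()]


def same_order(df_a, df_b, name="Name"):
    """True only if both columns hold the same values in the same positions."""
    return column_values(df_a, name) == column_values(df_b, name)


def same_members(df_a, df_b, name="Name"):
    """True if both columns hold the same distinct values, regardless of order."""
    return set(column_values(df_a, name)) == set(column_values(df_b, name))
```

For the frames in the question, same_order(DataframeC, DataframeD) should come out False while same_members(DataframeC, DataframeD) comes out True, provided the rows are returned in the order shown.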
Upvotes: 0
Reputation: 2431
Use the Scala code below as a reference and translate it into Python. Update the val check line to match your dataframe names.
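scala> import org.apache.spark.sql.expressions.Window
scala> import org.apache.spark.sql.functions._
scala> val w = Window.orderBy(lit(1))
scala> // Number the rows of both dataframes, full-join on the row number, and count the rows whose names differ.
scala> val check = dfA.withColumn("rn", row_number().over(w)).alias("A").
     |   join(dfB.withColumn("rn", row_number().over(w)).alias("B"), List("rn"), "full").
     |   withColumn("check", when(col("A.name") === col("B.name"), lit("match")).otherwise(lit("not match"))).filter(col("check") === "not match").count
scala> if (check == 0) println("matched") else println("not matched")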
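A rough PySpark translation could look like the sketch below. The function name names_match_by_row_number and the name_col parameter are hypothetical. It counts the mismatched rows after a full join, so frames that differ in any row or in length are reported as not matching. Note that Window.orderBy(lit(1)) does not define a real ordering; the row numbers simply follow whatever order Spark returns the rows in.

```python
def names_match_by_row_number(df_a, df_b, name_col="Name"):
    """Return True if both dataframes hold the same name_col values row by row."""
    # Imported inside the function so pyspark is only needed when it is called.
    from pyspark.sql import Window, functions as F

    # Constant ordering: row numbers follow the order Spark returns the rows in,
    # which is not strictly guaranteed for multi-partition data.
    w = Window.orderBy(F.lit(1))
    a = df_a.withColumn("rn", F.row_number().over(w)).alias("A")
    b = df_b.withColumn("rn", F.row_number().over(w)).alias("B")

    # The full join keeps rows that exist on only one side; eqNullSafe treats a
    # missing value as different from a present one instead of yielding NULL.
    same = F.col(f"A.{name_col}").eqNullSafe(F.col(f"B.{name_col}"))
    mismatches = a.join(b, on=["rn"], how="full").filter(~same).count()
    return mismatches == 0
```

Call it as names_match_by_row_number(DataframeA, DataframeB). If the dataframes have a real ordering column (an id or timestamp), ordering the window by that column instead of lit(1) makes the row numbers deterministic.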
Upvotes: 1