Reputation: 51
How to create a new column in Dataframe DF based on give condition. I have array of String and want to compare that with existing dataframe
dataframe DF
+-------------------+-----------+
| DiffColumnName| Datatype|
+-------------------+-----------+
| DEST_COUNTRY_NAME| StringType|
|ORIGIN_COUNTRY_NAME| StringType|
| COUNT|IntegerType|
+-------------------+-----------+
and Array of String having column names( this is not constant and can be changed)
val diffcolarray = Array("ORIGIN_COUNTRY_NAME", "COUNT")
I want to create a new column in DF based on a condition that if columns present in diffcolarray is also present in Dataframe's column DiffColumnName then yes else no.
I have tried below options however getting error
val newdf = df.filter(when(col("DiffColumnName") === df.columns.filter(diffcolarray.contains(_)), "yes").otherwise("no")).as("issue")
val newdf = valdfe.filter(when(col("DiffColumnName") === df.columns.map(diffcolarray.contains(_)), "yes").otherwise("no")).as("issue")
Looks like when comparing there is datatype mismatch.Output should be something like this. Any suggestion would be helpful. Thank you
+-------------------+-----------+----------+
| DiffColumnName| Datatype| Issue |
+-------------------+-----------+----------+
| DEST_COUNTRY_NAME| StringType| NO |
|ORIGIN_COUNTRY_NAME| StringType| NO |
| COUNT|IntegerType| YES |
+-------------------+-----------+----------+
Upvotes: 0
Views: 1091
Reputation: 340
This can give you the desired output.
df.withColumn("Issue",when(col("DiffColumnName").isin(diffcolarray: _*),"YES").otherwise("NO")).show(false)
Upvotes: 1