Reputation: 1330
I have written a function that takes a condition from a parameters file and adds a column value based on that condition, but I keep getting the error TypeError: condition should be a Column:
condition = "type_txt = 'clinic'"
input_df = input_df.withColumn(
    "prm_data_category",
    F.when(condition, F.lit("clinic"))  # this doesn't work
    .when(F.col("type_txt") == 'office', F.lit("office"))  # this works
    .otherwise(F.lit("other")),
)
Is there any way to use the condition as a SQL string, so it is easy to pass via a parameter instead of a Column?
Upvotes: 3
Views: 7556
Reputation: 8410
You can use a SQL expression via F.expr, which parses the string into a Column:
from pyspark.sql import functions as F

condition = "type_txt = 'clinic'"

input_df1 = input_df.withColumn(
    "prm_data_category",
    F.when(F.expr(condition), F.lit("clinic"))
    .when(F.col("type_txt") == 'office', F.lit("office"))
    .otherwise(F.lit("other")),
)
Upvotes: 3