Kaja

Reputation: 3057

Adding a new column to a PySpark DataFrame based on another column

I would like to add a new column to a DataFrame based on another column using when. I have the following code:

from pyspark.sql.functions import col, expr, when
df2=df.withColumn("test1",when(col("Country")=="DE","EUR").when(col("Country")=="PL","PLN").otherweise("Unknown"))

but I get the error: 'Column' object is not callable. How can I fix the problem?

Upvotes: 1

Views: 284

Answers (1)

notNull

Reputation: 31470

You have a typo in your statement: the method name is misspelled.

  • Change otherweise to otherwise

from pyspark.sql.functions import col, when

# Assumes an existing SparkSession named `spark`
df=spark.createDataFrame([("DE",),("PL",),("PO",)],["Country"])
df.withColumn("test1",when(col("Country") == "DE", "EUR").when(col("Country") == "PL", "PLN").otherwise("Unknown")).show()
#+-------+-------+
#|Country|  test1|
#+-------+-------+
#|     DE|    EUR|
#|     PL|    PLN|
#|     PO|Unknown|
#+-------+-------+
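As for why the typo surfaces as 'Column' object is not callable rather than an AttributeError: a PySpark Column treats an unknown attribute name as a reference to a nested field and returns another Column, so the call parentheses are then applied to a Column. The sketch below mimics that behaviour in plain Python so it runs without Spark. ToyColumn is a hypothetical stand-in written for this illustration, not the real pyspark.sql.Column, and it only approximates what the real class does.

```python
class ToyColumn:
    """Simplified, hypothetical stand-in for pyspark.sql.Column."""

    def __init__(self, name):
        self.name = name

    def otherwise(self, value):
        # A real, defined method: returns a new column expression.
        return ToyColumn(f"{self.name} ELSE {value!r}")

    def __getattr__(self, item):
        # Python calls this only when normal attribute lookup fails.
        # An unknown name is treated as a nested-field reference and
        # wrapped in a new column object instead of raising an error.
        if item.startswith("__"):
            raise AttributeError(item)
        return ToyColumn(f"{self.name}.{item}")


expr = ToyColumn("CASE WHEN Country = 'DE' THEN 'EUR'")

# Correct spelling: the method exists, so the call succeeds.
ok = expr.otherwise("Unknown")

# Misspelling: the attribute lookup returns a ToyColumn, and calling
# that object raises TypeError because it defines no __call__.
try:
    expr.otherweise("Unknown")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

So when this error appears on a chained Column expression, a misspelled method name is a good first thing to check.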

Upvotes: 1
