Reputation: 307
I'm trying to create a new column in a PySpark DataFrame based on the contents of another column. The other column contains only integers, and I want the new column to hold 1 when the value is in a given set of codes and 0 otherwise.
import pyspark.sql.functions as F
df2 = df2.withColumn('Industrial', F.when(F.col('CODE') in (1,2,3,4), 1).otherwise(0))
This doesn't work, since when() expects a Boolean column expression as its condition and Python's in operator doesn't produce one. Is there a workaround for this?
EDIT: Could still be useful for others since it creates a new column and does a little more than just a check of isin().
Upvotes: 0
Views: 680
Reputation: 610
Use the Column.isin method:
df2 = df2.withColumn('Industrial', F.when(F.col('CODE').isin([1, 2, 3, 4]), 1).otherwise(0))
Upvotes: 1