John Doe

Reputation: 10203

How to add an empty array using when and otherwise in pyspark

How can I add an empty array when using df.withColumn with when() and otherwise(***empty_array***)?
The new column's type is T.ArrayType(T.StringType()), and its values come from a UDF.

I want to avoid ending up with NaN values.

Upvotes: 0

Views: 894

Answers (2)

dsk

Reputation: 2003

Try the approach below: create a column with a None value, then cast it to an array type.

df_b = (
    df_b
    .withColumn("empty_array", F.when(F.col("rn") == F.lit("1"), None))
    .withColumn("empty_array", F.col("empty_array").cast(T.ArrayType(T.StringType())))
)
df_b.show()



 root
 |-- col1: string (nullable = true)
 |-- col2: string (nullable = true)
 |-- rn: integer (nullable = true)
 |-- case_condition: integer (nullable = true)
 |-- empty_array: array (nullable = true)
 |    |-- element: string (containsNull = true)

Upvotes: 0

Shubham Jain

Reputation: 5526

Simply use array(lit(None))

from pyspark.sql.functions import array, col, lit, when

df.select(
    when(col('target_bool') == 'true', array(lit(1)))
    .otherwise(array(lit(None)))
).show()

Upvotes: 2
