Ged

Reputation: 18108

pyspark UDF with null values check and if statement

The following PySpark UDF works as long as the array passed to it contains no null values.

concat_udf = udf(
    lambda con_str, arr: [x + con_str for x in arr], ArrayType(StringType())
)

I do not see how to adapt this to include a null / None check with an if. How should the following attempt, which does not work, be written correctly?

concat_udf = udf(lambda con_str, arr: [  if x is None: 'XXX' else: x + con_str for x in arr  ], ArrayType(StringType()))

I can find no such example. Using if with transform did not work either.

+----------+--------------+--------------------+
|      name|knownLanguages|          properties|
+----------+--------------+--------------------+
|     James| [Java, Scala]|[eye -> brown, ha...|
|   Michael|[Spark, Java,]|[eye ->, hair -> ...|
|    Robert|    [CSharp, ]|[eye -> , hair ->...|
|Washington|          null|                null|
| Jefferson|        [1, 2]|                  []|
+----------+--------------+--------------------+

should become

+----------+------------------------+--------------------+
|      name|          knownLanguages|          properties|
+----------+------------------------+--------------------+
|     James|     [JavaXXX, ScalaXXX]|[eye -> brown, ha...|
|   Michael|[SparkXXX, JavaXXX, XXX]|[eye ->, hair -> ...|
|    Robert|        [CSharpXXX, XXX]|[eye -> , hair ->...|
|Washington|                     XXX|                null|
| Jefferson|            [1XXX, 2XXX]|                  []|
+----------+------------------------+--------------------+

Upvotes: 0

Views: 1084

Answers (1)

Steven

Reputation: 15318

Using Python's conditional expression (the ternary operator), I would do something like this:

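from pyspark.sql.functions import udf
from pyspark.sql.types import ArrayType, StringType

# A null element becomes "XXX"; a null array becomes ["XXX"].
concat_udf = udf(
    lambda con_str, arr: [x + con_str if x is not None else "XXX" for x in arr]
    if arr is not None
    else ["XXX"],
    ArrayType(StringType()),
)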

# OR (note: `arr or [None]` also maps an empty array to ["XXX"])

concat_udf = udf(
    lambda con_str, arr: [
        x + con_str if x is not None else "XXX" for x in arr or [None]
    ],
    ArrayType(StringType()),
)
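
The same logic may be easier to read as a named function than as a lambda. What follows is only a minimal sketch, not part of the original answer: the name `concat_or_default` and its `default` parameter are hypothetical. The Spark wrapping appears only as a comment, so the logic can be checked in plain Python without a Spark session.

```python
from typing import List, Optional


def concat_or_default(
    con_str: str,
    arr: Optional[List[Optional[str]]],
    default: str = "XXX",
) -> List[str]:
    """Append con_str to every element of arr.

    None elements are replaced by `default`; a None array yields [default].
    (`concat_or_default` and `default` are illustrative names only.)
    """
    if arr is None:
        return [default]
    # Conditional expression inside the comprehension: value-if-true first,
    # then the condition, then the fallback.
    return [x + con_str if x is not None else default for x in arr]


# With pyspark installed, the function could be registered like the lambdas above:
# concat_udf = udf(concat_or_default, ArrayType(StringType()))
```

Because the check here is `arr is None` rather than `arr or [None]`, an empty array stays empty instead of becoming ["XXX"].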

Upvotes: 1
