AlienDeg

Reputation: 1339

How to create empty struct in pyspark?

I'm trying to create an empty struct column in PySpark. For an array, this works:

import pyspark.sql.functions as F
df = df.withColumn('newCol', F.array([]))

but this gives me an error.

df = df.withColumn('newCol', F.struct())

I saw a similar question, but it was for Scala rather than PySpark, so it doesn't really help me.

Upvotes: 3

Views: 5211

Answers (2)

Christophe

Reputation: 696

Actually the array is not really empty, because it has an empty element. You should instead consider something like this:

import pyspark.sql.types as T
df = df.withColumn('newCol', F.lit(None).cast(T.StructType()))

PS: This is a late conversion of my comment into an answer, as was suggested. I hope it helps even though it comes long after the OP's question.

Upvotes: 2

Demet Sude Saplık

Reputation: 23

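If you know the schema of the struct column, you can use the `from_json` function as follows:

    import pyspark.sql.functions as F
    from pyspark.sql.types import StructType, StructField, StringType

    # Schema the new struct column should have
    struct_schema = StructType([
       StructField('name', StringType(), False),
       StructField('surname', StringType(), False),
    ])

    # Parse an empty string literal against that schema
    df = df.withColumn(
      'newCol', F.from_json(F.lit(""), struct_schema)
    )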

Upvotes: 1
