Reputation: 1339
I'm trying to create an empty struct column in PySpark. For an array this works:
import pyspark.sql.functions as F
df = df.withColumn('newCol', F.array([]))
but the equivalent for a struct gives me an error:
df = df.withColumn('newCol', F.struct())
I saw a similar question, but it was for Scala, not PySpark, so it doesn't really help me.
Upvotes: 3
Views: 5211
Reputation: 696
Actually, the array is not really empty, because it has an empty element. You should instead consider something like this:
import pyspark.sql.types as T
df = df.withColumn('newCol', F.lit(None).cast(T.StructType()))
PS: this is a late conversion of my comment into an answer, as was suggested - I hope it helps even though it comes long after the OP's question.
Upvotes: 2
Reputation: 23
If you know the schema of the struct column, you can use the function from_json as follows:
from pyspark.sql.types import StructType, StructField, StringType
import pyspark.sql.functions as F

struct_schema = StructType([
    StructField('name', StringType(), False),
    StructField('surname', StringType(), False),
])
df = df.withColumn(
    'newCol', F.from_json(F.lit(""), struct_schema)
)
Upvotes: 1