Reputation: 19308
I have the following Spark DataFrame and would like to change the type of the nums column:
+---------+----------------------------+
|firstname|nums                        |
+---------+----------------------------+
|James    |[[null, null], [null, null]]|
|Michael  |[[null, null], [null, null]]|
+---------+----------------------------+
Here's the type of nums: StructField('nums', ArrayType(ArrayType(NullType(), True), True), True)
This is what I tried:
desired_type = StructField("nums", ArrayType(ArrayType(IntegerType(), True), True), True)
df = df.withColumn("nums", col("nums").cast(desired_type))
This is the error I got: IllegalArgumentException: Failed to convert the JSON string '{"metadata":{},"name":"nums","nullable":true,"type":{"containsNull":true,"elementType":{"containsNull":true,"elementType":"integer","type":"array"},"type":"array"}}' to a data type.
Here's the full example:
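from pyspark.sql import SparkSession
from pyspark.sql.functions import col
from pyspark.sql.types import (
    ArrayType, IntegerType, NullType, StringType, StructField, StructType,
)

# Reuses the active session if one exists, otherwise starts a new one.
spark = SparkSession.builder.getOrCreate()

data2 = [
    ("James", [[None, None], [None, None]]),
    ("Michael", [[None, None], [None, None]]),
]
schema = StructType(
    [
        StructField("firstname", StringType(), True),
        StructField("nums", ArrayType(ArrayType(NullType(), True), True), True),
    ]
)
df = spark.createDataFrame(data=data2, schema=schema)
desired_type = StructField("nums", ArrayType(ArrayType(IntegerType(), True), True), True)
# The next line raises the IllegalArgumentException shown above.
df = df.withColumn("nums", col("nums").cast(desired_type))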
Upvotes: 2
Views: 240
Reputation: 1151
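cast expects a column type such as ArrayType. A StructField describes a named field (name, type, nullability), so Spark cannot parse it as a type. The desired_type should be created like this: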
from pyspark.sql import functions as F
from pyspark.sql.types import ArrayType, IntegerType

desired_type = ArrayType(ArrayType(IntegerType(), True), True)
df = df.withColumn("nums", F.col("nums").cast(desired_type))
df.printSchema()
root
 |-- firstname: string (nullable = true)
 |-- nums: array (nullable = true)
 |    |-- element: array (containsNull = true)
 |    |    |-- element: integer (containsNull = true)
Upvotes: 2
Reputation: 31470
Try casting to the type string array<array<int>> instead of passing a StructField.
Example:
df = df.withColumn("nums", col("nums").cast("array<array<int>>"))
print(df.schema)
df.printSchema()
# StructType([StructField('firstname', StringType(), True), StructField('nums', ArrayType(ArrayType(IntegerType(), True), True), True)])
# root
#  |-- firstname: string (nullable = true)
#  |-- nums: array (nullable = true)
#  |    |-- element: array (containsNull = true)
#  |    |    |-- element: integer (containsNull = true)
Upvotes: 1