Nitin

Reputation: 476

Modifying element in nested array of struct

I have a nested array of structs and would like to rename one of the struct's fields, as shown in the example below.

Source format

 |-- HelloWorld: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- version: string (nullable = true)
 |    |    |-- abc-version: string (nullable = true) -----> This part needs to be renamed
 |    |    |-- again_something: array (nullable = true)
 |    |    |    |-- element: map (containsNull = true)
 |    |    |    |    |-- key: string
 |    |    |    |    |-- value: string (valueContainsNull = true)

Output format should look like below.

 |-- HelloWorld: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- version: string (nullable = true)
 |    |    |-- abc_version: string (nullable = true) -----> This part has changed
 |    |    |-- again_something: array (nullable = true)
 |    |    |    |-- element: map (containsNull = true)
 |    |    |    |    |-- key: string
 |    |    |    |    |-- value: string (valueContainsNull = true)

I tried several approaches using withField and F.expr to rename the field, but none of them worked.

Please help.

Upvotes: 1

Views: 66

Answers (1)

wwnde

Reputation: 26676

I would rebuild each struct inside transform and alias the field to its new name, keeping its original data type (string):

 from pyspark.sql import functions as F
 df3 = df.withColumn("HelloWorld", F.expr("transform(HelloWorld, x -> struct(x.version, x['abc-version'] as abc_version, x.again_something))"))

root
 |-- HelloWorld: array (nullable = true)
 |    |-- element: struct (containsNull = false)
 |    |    |-- version: string (nullable = true)
 |    |    |-- abc_version: string (nullable = true)
 |    |    |-- again_something: array (nullable = true)
 |    |    |    |-- element: map (containsNull = true)
 |    |    |    |    |-- key: string
 |    |    |    |    |-- value: string (valueContainsNull = true)
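
If the struct has many fields, typing the transform expression by hand gets error-prone. A minimal sketch of a helper that builds the expression string from a field list and a rename mapping is shown below. The function name `rename_struct_fields_expr` and its parameters are hypothetical, not part of PySpark, and the sketch assumes field names contain no backticks.

```python
def rename_struct_fields_expr(array_col, fields, renames):
    """Build a Spark SQL transform() expression that rebuilds each struct
    element of `array_col`, renaming the fields listed in `renames`.

    fields  -- all field names of the struct, in the order to keep
    renames -- mapping of old field name -> new field name
    """
    parts = []
    for name in fields:
        new_name = renames.get(name, name)
        # Backticks let Spark SQL parse names such as abc-version.
        parts.append(f"x.`{name}` as `{new_name}`")
    return f"transform(`{array_col}`, x -> struct({', '.join(parts)}))"


# Field names taken from the schema in the question.
expr = rename_struct_fields_expr(
    "HelloWorld",
    ["version", "abc-version", "again_something"],
    {"abc-version": "abc_version"},
)
print(expr)
```

With a DataFrame `df`, the string would then be applied as `df.withColumn("HelloWorld", F.expr(expr))`. The field list can likely be read from the schema instead of being hard-coded, for example via `df.schema["HelloWorld"].dataType.elementType.fieldNames()`.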

Upvotes: 1
