Reputation: 195
How do I select the "cat_item.category" columns in PySpark?
The schema is as follows:
root
 |-- result: struct (nullable = true)
 |    |-- active: string (nullable = true)
 |    |-- cat_item.category: struct (nullable = true)
 |    |    |-- display_value: string (nullable = true)
 |    |    |-- link: string (nullable = true)
 |    |-- number: string (nullable = true)
 |    |-- sys_id: string (nullable = true)
I tried the following, but I get an error:
df22 = df22.select("result.active", "result.cat_item.category.display_value", "result.cat_item.category.link", "result.number", "result.sys_id")
How do I select the struct columns?
Upvotes: 0
Views: 402
Reputation: 32710
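The field name itself contains a dot (.), so Spark reads "cat_item.category" as a field "category" nested inside a field "cat_item". You need to escape the whole name with backticks (`):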
df22.select("result.`cat_item.category`.display_value")
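The same escaping applies to every path that goes through that field. As a rough sketch, the full column list from the question could be built like this. The helpers quote_field and column_path are hypothetical names made up for illustration, not part of PySpark, and they assume field names contain no backticks themselves:

```python
def quote_field(name):
    """Wrap a field name in backticks when it contains a dot (hypothetical helper)."""
    return "`" + name + "`" if "." in name else name


def column_path(*fields):
    """Join field names into a dotted path, escaping any name that contains a dot."""
    return ".".join(quote_field(f) for f in fields)


# Field names taken from the schema in the question.
columns = [
    column_path("result", "active"),
    column_path("result", "cat_item.category", "display_value"),
    column_path("result", "cat_item.category", "link"),
    column_path("result", "number"),
    column_path("result", "sys_id"),
]

# With the question's DataFrame, the strings can then be passed to select:
# df22 = df22.select(*columns)
```

Writing the backticks by hand, as in the line above, works just as well. The helper only saves repetition when several paths go through the dotted field.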
Upvotes: 1