Reputation: 7745
I have a dataframe with the following schema:
root
|-- Id: long (nullable = true)
|-- LastUpdate: string (nullable = true)
|-- Info: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- Purchase: array (nullable = true)
| | | |-- element: struct (containsNull = true)
| | | | |-- Amount: long (nullable = true)
| | | | |-- Name: string (nullable = true)
| | | | |-- Type: string (nullable = true)
How can I select the Amount column such that I can cast it?
Tried:
df = df.withColumn("Info.Purchase.Amount", df["Info.Purchase.Amount"].cast(DoubleType()))
But got:
org.apache.spark.sql.AnalysisException: cannot resolve '`Info`.`Purchase`['Amount']'
Upvotes: 0
Views: 1789
Reputation: 219
You can use the method below to extract the nested array:
df.select(col("info").getField("Purchase").getField("Amount")).show()
This will give you a nested array of all the Amount values, which you can then cast.
Upvotes: 1