Reputation: 2074
Title says it all:
Is there an equivalent to the Spark SQL LATERAL VIEW clause in the Spark API, so that I can generate a column from a UDF that returns a struct holding multiple columns' worth of data, and then laterally spread the struct's fields into the parent DataFrame as individual columns?
Something equivalent to df.select(expr("LATERAL VIEW udf(col1,col2...coln)"))
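For concreteness, here is a rough sketch of the SQL form I have in mind (the people table, the parseName UDF, and its fields are made-up names for illustration):

// Hypothetical sketch only: parseName is a registered UDF returning a struct.
spark.udf.register("parseName", (full: String) => {
  val parts = full.split(" ")
  (parts(0), parts(1))
})
peopleDf.createOrReplaceTempView("people")
spark.sql("""
  SELECT p.*, parts.*
  FROM people p
  LATERAL VIEW explode(array(parseName(full_name))) t AS parts
""")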
Upvotes: 0
Views: 1237
Reputation: 2074
I solved this by selecting the udf into a column:
val dfWithUdfResolved = dataFrame.select(calledUdf().as("tuple_column"))
... then ...
dfWithUdfResolved
.withColumn("newCol1", $"tuple_column._1")
.withColumn("newCol2", $"tuple_column._2")
// ...
.withColumn("newColn", $"tuple_column._n")
Basically, this uses tuple field notation (_1, _2, ...) to pull the values out of the struct column into new discrete columns.
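For illustration, here is the whole thing as a self-contained sketch; the input columns and the UDF body are invented, but the select/withColumn pattern is the one described above:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

val spark = SparkSession.builder().appName("struct-udf-example").master("local[*]").getOrCreate()
import spark.implicits._

// Hypothetical input data and UDF; the real UDF can take any number of columns.
// A Scala function returning a tuple shows up in Spark as a struct with fields _1, _2, ...
val dataFrame = Seq(("alice", 30), ("bob", 25)).toDF("name", "age")
val calledUdf = udf((name: String, age: Int) => (name.toUpperCase, age + 1))

// Select the UDF result as a single struct column...
val dfWithUdfResolved = dataFrame.select(calledUdf($"name", $"age").as("tuple_column"))

// ...then pull each struct field out into its own discrete column.
dfWithUdfResolved
  .withColumn("newCol1", $"tuple_column._1")
  .withColumn("newCol2", $"tuple_column._2")
  .show(false)

If the generated field names (_1, _2, ...) are acceptable as-is, a select with star expansion, e.g. dfWithUdfResolved.select($"tuple_column.*"), flattens all the struct fields in one go.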
Upvotes: 0