Rimer

Reputation: 2074

Spark: Is there an equivalent to Spark SQL's LATERAL VIEW in the Spark API?

Title says it all:

Is there an equivalent to the Spark SQL LATERAL VIEW command in the Spark API, so that I can generate a column from a UDF that returns a struct containing multiple columns' worth of data, and then laterally spread the struct's fields into the parent DataFrame as individual columns?

Something equivalent to df.select(expr("LATERAL VIEW udf(col1,col2...coln)"))

Upvotes: 0

Views: 1237

Answers (1)

Rimer

Reputation: 2074

I solved this by selecting the UDF result into a column:

val dfWithUdfResolved = dataFrame.select(calledUdf().as("tuple_column"))

... then ...

dfWithUdfResolved
  .withColumn("newCol1", $"tuple_column._1")
  .withColumn("newCol2", $"tuple_column._2")
  // ...
  .withColumn("newColn", $"tuple_column._n")

Basically, I use the tuple field accessors (`_1`, `_2`, ..., `_n`) on the struct column to pull its values out into new, discrete columns.
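A minimal end-to-end sketch of the same pattern. The UDF `makePair` and all column names here are illustrative, not from the question:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

object TupleColumnExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .master("local[*]")
      .appName("lateral-view-equivalent")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical UDF returning a tuple; Spark exposes it as a struct
    // column with fields _1 and _2.
    val makePair = udf((s: String, n: Int) => (s.toUpperCase, n * 2))

    val df = Seq(("a", 1), ("b", 2)).toDF("col1", "col2")

    // Select the UDF result into a single struct column...
    val withTuple = df.select(makePair($"col1", $"col2").as("tuple_column"))

    // ...then spread the struct fields into individual columns.
    val spread = withTuple
      .withColumn("newCol1", $"tuple_column._1")
      .withColumn("newCol2", $"tuple_column._2")
      .drop("tuple_column")

    spread.show()
    spark.stop()
  }
}
```

If you don't need to rename the fields one by one, `withTuple.select($"tuple_column.*")` expands every struct field into its own column in a single step.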

Upvotes: 0
