pikapoo

Reputation: 77

How to convert a Dataset[Seq[T]] to Dataset[T] in Spark

How do I convert a Dataset[Seq[T]] to Dataset[T]?

For example, Dataset[Seq[Car]] to Dataset[Car].

Upvotes: 1

Views: 1046

Answers (1)

T. Gawęda

Reputation: 16086

You can use flatMap, which returns a Dataset with one row per element of each sequence:

val df = Seq(Seq(1, 2, 3), Seq(4, 5, 6, 7)).toDF("s").as[Seq[Int]]
df.flatMap(x => x.toList)

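You can also try the explode function. Here the column s holds Car structs; explode puts each element in a column named col, and "col.*" expands its fields so the rows can be read back as Car: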

df.select(explode('s)).select("col.*").as[Car]

Full example:

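// spark-shell imports spark.implicits._ automatically; elsewhere, add
// `import spark.implicits._` after creating the SparkSession.
import org.apache.spark.sql.functions._
case class Car(i: Int)
val df = Seq(List(Car(1), Car(2), Car(3))).toDF("s").as[List[Car]]
// Typed approach: flatten each list into individual rows.
val df1 = df.flatMap(x => x.toList)
// Untyped approach: explode the array column, expand the struct, then convert back to Car.
val df2 = df.select(explode('s)).select("col.*").as[Car]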
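
Outside spark-shell, the same ideas could be sketched as a small standalone object. This is only an illustration under a few assumptions: Spark 2.2 or later (for the built-in Seq encoders), a local SparkSession created just for the example, and hypothetical names (Flatten, explodeInts, explodeCars, flatMapCars) that are not part of Spark. It also shows that for primitive elements such as Int the "col.*" step is not needed, since the exploded column is not a struct.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.functions.explode

// Top-level case class so Spark can derive an encoder for it.
case class Car(i: Int)

// Hypothetical helper object, for illustration only.
object Flatten {
  // Local session for the sketch; in spark-shell, `spark` already exists.
  val spark: SparkSession = SparkSession.builder()
    .master("local[*]")
    .appName("flatten-sketch")
    .getOrCreate()
  import spark.implicits._

  // Primitive elements: explode yields a single column named "col",
  // which can be read back directly as Int.
  def explodeInts(ds: Dataset[Seq[Int]]): Dataset[Int] =
    ds.toDF("s").select(explode($"s")).as[Int]

  // Struct elements: "col.*" expands the struct fields before converting to Car.
  def explodeCars(ds: Dataset[Seq[Car]]): Dataset[Car] =
    ds.toDF("s").select(explode($"s")).select("col.*").as[Car]

  // Typed alternative that works for any element type with an encoder.
  def flatMapCars(ds: Dataset[Seq[Car]]): Dataset[Car] =
    ds.flatMap(xs => xs.toList)
}
```

For example, after `import Flatten.spark.implicits._`, calling `Flatten.explodeInts(Seq(Seq(1, 2, 3), Seq(4, 5, 6, 7)).toDS())` should give a Dataset[Int] with one row per number. The flatMap version keeps everything typed, while the explode versions stay in column expressions until the final `as[...]`.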

Upvotes: 3
