Reputation: 7886
I have the code below, where I'm trying to create a Spark DataFrame with a field that is a struct. What should I replace ???
with to get this to work?
import org.apache.spark.sql.types._
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
val spark: SparkSession = SparkSession.builder()
  .appName("NodesLanesTest")
  .getOrCreate()
val someData = Seq(
  Row(1538161836000L, 1538075436000L, "cargo3", 3L, ???("Chicago", "1234"))
)
val someSchema = StructType(
  List(
    StructField("ata", LongType, nullable = false),
    StructField("atd", LongType, nullable = false),
    StructField("cargo", StringType, nullable = false),
    StructField("createdDate", LongType, nullable = false),
    StructField("destination",
      StructType(List(
        StructField("name", StringType, nullable = false),
        StructField("uuid", StringType, nullable = false)
      )))
  )
)
val someDF = spark.createDataFrame(
  spark.sparkContext.parallelize(someData),
  someSchema
)
Upvotes: 2
Views: 2114
Reputation: 1217
You're missing a Row object. When you create a DataFrame from a sequence of Row
objects, each StructType field is expected to be represented as a nested Row
object, so this should work for you:
val someData = Seq(
  Row(1538161836000L, 1538075436000L, "cargo3", 3L, Row("Chicago", "1234"))
)
Hope it helps.
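For completeness, here is a minimal end-to-end sketch of the nested Row approach, written to be pasted into spark-shell or a Scala script. The `.master("local[*]")` setting, the `destinationType` value, and the final `select` on `destination.name` and `destination.uuid` are my own additions for illustration and are not part of the question's code.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types._

// Assumption: a local master is set only so the sketch runs standalone.
val spark: SparkSession = SparkSession.builder()
  .appName("NodesLanesTest")
  .master("local[*]")
  .getOrCreate()

// Hypothetical helper value: the struct type of the nested "destination" field.
val destinationType = StructType(List(
  StructField("name", StringType, nullable = false),
  StructField("uuid", StringType, nullable = false)
))

val someSchema = StructType(List(
  StructField("ata", LongType, nullable = false),
  StructField("atd", LongType, nullable = false),
  StructField("cargo", StringType, nullable = false),
  StructField("createdDate", LongType, nullable = false),
  StructField("destination", destinationType)
))

// The struct column is supplied as a nested Row whose values follow
// the field order of destinationType (name, then uuid).
val someData = Seq(
  Row(1538161836000L, 1538075436000L, "cargo3", 3L, Row("Chicago", "1234"))
)

val someDF = spark.createDataFrame(
  spark.sparkContext.parallelize(someData),
  someSchema
)

someDF.printSchema()

// Nested fields can be addressed with dot notation.
val first = someDF.select("destination.name", "destination.uuid").head()
val destinationName = first.getString(0)
val destinationUuid = first.getString(1)
```

The values inside the nested Row are matched to the struct's fields by position, not by name, so their order has to follow the order in the schema.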
Upvotes: 4