Reputation: 25
What is a possible reason why this code doesn't work, and how can I fix it? Yes, the f1, f2, f3, f4 fields are unused, but in production code getList takes an xml_data argument, so I pass the xml_data field to the method and get a List[AnyRef].
import java.sql.Timestamp

import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.encoders.RowEncoder
import org.apache.spark.sql.types._

def getList: List[AnyRef] = {
  List(
    "string",
    new Integer(20),
    Decimal(BigDecimal(10000), 10, 2),
    Timestamp.valueOf("2021-01-01 10:00:00"),
    List("1", "2", "3"),
    List(1, 2, 3),
    List(Decimal(BigDecimal(10000), 10, 2)),
    List(Timestamp.valueOf("2021-01-01 10:00:00")),
    null,
    List(null))
}
...
val schema = StructType(Seq(
  StructField("string", StringType),
  StructField("int", IntegerType),
  StructField("decimal", DecimalType(10, 2)),
  StructField("timestamp", TimestampType),
  StructField("array_string", ArrayType(StringType)),
  StructField("array_int", ArrayType(IntegerType)),
  StructField("array_decimal", ArrayType(DecimalType(10, 2))),
  StructField("array_timestamp", ArrayType(TimestampType)),
  // the last two (duplicate-named) fields line up with the null and List(null) entries of getList
  StructField("array_int", ArrayType(IntegerType)),
  StructField("array_string", ArrayType(StringType))
))
val encoder = RowEncoder(schema)
import spark.implicits._
// Rec is assumed to be a case class with the four Int fields f1, f2, f3, f4
List((1, 2, 3, 4))
  .toDF("f1", "f2", "f3", "f4")
  .as[Rec]
  .map(rec => {
    Row(getList)
  })(encoder)
  .show()
Upvotes: 0
Views: 1149
Reputation: 15086
Row doesn't take a List but a varargs parameter. To expand that list into varargs you have to use the type ascription : _*. Otherwise the whole list will be interpreted as the first (and only) parameter.
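For example, constructing a Row directly shows the difference (a quick sketch; the values are just for illustration):

import org.apache.spark.sql.Row

val r1 = Row(List(1, 2, 3))      // one field, holding the whole List
val r2 = Row(List(1, 2, 3): _*)  // three fields: 1, 2, 3
r1.length  // 1
r2.length  // 3

Applied to the snippet from the question: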
List((1, 2, 3, 4))
  .toDF("f1", "f2", "f3", "f4")
  .as[Rec]
  .map(rec => {
    Row(getList: _*)
  })(encoder)
  .show()
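The same expansion works for any Scala varargs parameter, not just Row (a minimal sketch; sum and nums are made-up names):

def sum(xs: Int*): Int = xs.sum

val nums = List(1, 2, 3)
sum(nums: _*)  // expands to sum(1, 2, 3) and returns 6

With the expansion in place, each of the ten elements returned by getList is mapped positionally to one of the ten fields in the schema, so the Row arity matches the encoder.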
Upvotes: 5