Mikel San Vicente

Reputation: 3863

Spark Frameless withColumnRenamed nested field

Let's say I have the following code

case class MyTypeInt(a: String, b: MyType2)
case class MyType2(v: Int)
case class MyTypeLong(a: String, b: MyType3)
case class MyType3(v: Long)

val typedDataset = TypedDataset.create(Seq(MyTypeInt("v", MyType2(1))))
typedDataset.withColumnRenamed(???, typedDataset.colMany('b, 'v).cast[Long]).as[MyTypeLong]

How can I implement this transformation when the field I am trying to transform is nested? The signature of withColumnRenamed takes a Symbol as its first parameter, so I don't see how to refer to a nested field there.

Upvotes: 2

Views: 386

Answers (1)

Oli

Reputation: 10406

withColumnRenamed only renames a column; it does not let you transform it. To do that, you should use withColumn. One approach is then to cast the nested field and recreate the struct.

scala> val new_ds = ds.withColumn("b", struct($"b.v" cast "long" as "v")).as[MyTypeLong]
scala> new_ds.printSchema
root
|-- a: string (nullable = true)
|-- b: struct (nullable = false)
|    |-- v: long (nullable = true) 
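For reference, here is a more self-contained sketch of the same approach with the imports it needs. The session and the plain Dataset[MyTypeInt] named ds are assumptions of mine, not part of the question; when starting from Frameless, the underlying Dataset can be reached via typedDataset.dataset.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.struct

// Hypothetical local session for the sketch
val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// Plain Dataset[MyTypeInt]; e.g. typedDataset.dataset when coming from Frameless
val ds = Seq(MyTypeInt("v", MyType2(1))).toDS()

// Cast the nested field, rebuild the `b` struct, then re-type the result
val new_ds = ds.withColumn("b", struct($"b.v".cast("long").as("v"))).as[MyTypeLong]
new_ds.printSchema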

Another approach would be to use map and build the object yourself:

ds.map{ case MyTypeInt(a, MyType2(b)) => MyTypeLong(a, MyType3(b)) } 
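Note that the map version deserializes every row into the case classes on the JVM, whereas the withColumn/cast version stays at the Catalyst expression level. For completeness, the same map runs unchanged on the assumed ds from the sketch above:

val mapped = ds.map{ case MyTypeInt(a, MyType2(b)) => MyTypeLong(a, MyType3(b)) }
mapped.printSchema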

Upvotes: -1
