D-Dᴙum

Reputation: 7890

Convert Dataset<Row> to a Typed Dataset having optional Parameters

I have a Dataset that I wish to convert to a typed Dataset, where the type is a case class with Option for several parameters. For example, using the Spark shell I create a case class, an encoder, and an (untyped) Dataset:

case class Analogue(id: Long, t1: Option[Double] = None, t2: Option[Double] = None)
val df = Seq((1, 34.0), (2, 3.4)).toDF("id", "t1")
implicit val analogueChannelEncoder: Encoder[Analogue] = Encoders.product[Analogue]

I want to create a Dataset<Analogue> from df so I try:

df.as(analogueChannelEncoder)

But this results in the error:

org.apache.spark.sql.AnalysisException: cannot resolve '`t2`' given input columns: [id, t1];

Looking at the schemas of df and analogueChannelEncoder the difference is apparent:

scala> df.schema
res3: org.apache.spark.sql.types.StructType = StructType(StructField(id,IntegerType,false), StructField(t1,DoubleType,false))

scala> analogueChannelEncoder.schema
res4: org.apache.spark.sql.types.StructType = StructType(StructField(id,LongType,false), StructField(t1,DoubleType,true), StructField(t2,DoubleType,true))

I have seen this answer, but it will not work for me as my Dataset is assembled programmatically and is not a straightforward load from a data source.

How can I cast my untyped Dataset<Row> to a Dataset<Analogue>?

Upvotes: 0

Views: 801

Answers (2)

Ram Ghadiyaram

Reputation: 29165

Your case class:

case class Analogue(id: Long, t1: Option[Double] = None, t2: Option[Double] = None)

Your conversion code:

val encoderSchema = Encoders.product[Analogue].schema
val df1: Dataset[Row] = spark.createDataset(Seq((1, 34.0), (2, 3.4)))
  .map(x => Analogue(x._1, Option(x._2), None))
  .toDF("id", "t1", "t2")
df1.show()

df1.printSchema()
encoderSchema.printTreeString()

Result:

+---+----+----+
| id|  t1|  t2|
+---+----+----+
|  1|34.0|null|
|  2| 3.4|null|
+---+----+----+

root
 |-- id: long (nullable = false)
 |-- t1: double (nullable = true)
 |-- t2: double (nullable = true)

root
 |-- id: long (nullable = false)
 |-- t1: double (nullable = true)
 |-- t2: double (nullable = true)

Update (with more optional columns added to the case class):

Assuming your case class has many fields (for example, 5 fields), if the optional values are None it works as shown below. Here is the example:

case class Analogue(
    id: Long,
    t1: Option[Double] = None,
    t2: Option[Double] = None,
    t3: Option[Double] = None,
    t4: Option[Double] = None,
    t5: Option[Double] = None)

val encoderSchema = Encoders.product[Analogue].schema
println(encoderSchema.toSeq)
val df1 = spark.createDataset(Seq((1, 34.0), (2, 3.4)))
  .map(x => Analogue(x._1, Option(x._2)))
  .as[Analogue].toDF()
df1.show()
df1.printSchema()
encoderSchema.printTreeString()

If you set only the fields that are present, the remaining fields are taken as None.

StructType(StructField(id,LongType,false), StructField(t1,DoubleType,true), StructField(t2,DoubleType,true), StructField(t3,DoubleType,true), StructField(t4,DoubleType,true), StructField(t5,DoubleType,true))
+---+----+----+----+----+----+
| id|  t1|  t2|  t3|  t4|  t5|
+---+----+----+----+----+----+
|  1|34.0|null|null|null|null|
|  2| 3.4|null|null|null|null|
+---+----+----+----+----+----+

root
 |-- id: long (nullable = false)
 |-- t1: double (nullable = true)
 |-- t2: double (nullable = true)
 |-- t3: double (nullable = true)
 |-- t4: double (nullable = true)
 |-- t5: double (nullable = true)

root
 |-- id: long (nullable = false)
 |-- t1: double (nullable = true)
 |-- t2: double (nullable = true)
 |-- t3: double (nullable = true)
 |-- t4: double (nullable = true)
 |-- t5: double (nullable = true)

If it does not work this way, please consider the broadcast idea from my comment and work further on it.

Upvotes: 1

D-Dᴙum

Reputation: 7890

I have resolved the issue by inspecting the 'incoming' Dataset<Row> for its columns and comparing them to the columns of a Dataset<Analogue>. I use the resulting difference to append new (null) columns to my Dataset<Row> before casting it as Dataset<Analogue>.
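For anyone else hitting this, a minimal sketch of that approach might look as follows. It assumes the two-column df and the Analogue case class from the question; the intermediate names (expected, missing, widened) are my own, and the id column is additionally cast from Int to Long to match the case class.

```scala
import org.apache.spark.sql.{Dataset, Encoders, SparkSession}
import org.apache.spark.sql.functions.{col, lit}

case class Analogue(id: Long, t1: Option[Double] = None, t2: Option[Double] = None)

object WidenToTyped {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.master("local[1]").appName("widen").getOrCreate()
    import spark.implicits._

    val df = Seq((1, 34.0), (2, 3.4)).toDF("id", "t1")

    // Fields the encoder expects but the incoming DataFrame lacks
    val expected = Encoders.product[Analogue].schema
    val missing  = expected.filterNot(f => df.columns.contains(f.name))

    // Append each missing column as a typed null, and align id with Long
    val widened = missing
      .foldLeft(df) { (acc, f) => acc.withColumn(f.name, lit(null).cast(f.dataType)) }
      .withColumn("id", col("id").cast("long"))

    val typed: Dataset[Analogue] = widened.as[Analogue]
    typed.show()
    spark.stop()
  }
}
```

Because StructType is a Seq[StructField], the set difference falls out of a simple filterNot, and lit(null).cast(...) keeps each appended column's type consistent with the encoder's schema.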

Upvotes: 0
