Scala: converting a joinWithCassandraTable result to a DataFrame

Question

I'm using the DataStax Spark Cassandra Connector to access some data in Cassandra. My requirement is to join an RDD with a Cassandra table, fetch the result, and store it in a Hive table.

I'm using joinWithCassandraTable to join against the Cassandra table. After the join, the resulting RDD looks like this:

com.datastax.spark.connector.rdd.CassandraJoinRDD[org.apache.spark.sql.Row, 
com.datastax.spark.connector.CassandraRow] = 
CassandraJoinRDD[17] at RDD at CassandraRDD.scala:19

I tried the steps below to convert it to a DataFrame, but the conversion fails:

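import org.apache.spark.sql.Row

// Keep only the Cassandra side of each joined pair and wrap its column values in a Row
val data = joinWithRDD.map {
  case (_, cassandraRow) => Row(cassandraRow.columnValues: _*)
}

sqlContext.createDataFrame(data, schema)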

I'm getting the following error:

java.lang.ClassCastException: cannot assign instance of
   scala.collection.immutable.List$SerializationProxy to field 
   org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of 
   type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD

Can you please help me convert the result of joinWithCassandraTable to a DataFrame?

Answers (1)
