ZhongBot

Reputation: 129

Cassandra Scala Spark - saving RDD to Cassandra

I have the following RDD

RDD[(String, Seq[((String, Double), Int)])]

An example element of this RDD would be:

("a", Seq((("b", 2.0), 1), (("c", 3.0), 2)))

I want to insert it into a Cassandra table with the following schema:

String (PK), String, Double, Int

In the end, for the given example, I will have the following rows in my DB:

"a", "b", 2.0, 1
"a", "c", 3.0, 2

What is the Scala code that does this? I tried to use saveToCassandra, but my input isn't in the form of RDD[(String, String, Double, Int)]. Should I flatten it first?

Upvotes: 0

Views: 638

Answers (1)

zero323

Reputation: 330443

All you need here is a flatMap:

import org.apache.spark.rdd.RDD

val rdd: RDD[(String, Seq[((String, Double), Int)])] = ???

// Repeat the key for every element of the inner Seq, unnesting the pairs
val flattened: RDD[(String, String, Double, Int)] = rdd.flatMap {
  case (k, vs) => vs.map { case ((v1, v2), v3) => (k, v1, v2, v3) }
}
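The flattening logic can be checked on plain Scala collections, since flatMap behaves the same way on an RDD. The saveToCassandra call at the end is a sketch: the keyspace, table, and column names are placeholders to be replaced with your own.

```scala
// The flattening step, demonstrated on an ordinary Seq:
val data = Seq(("a", Seq((("b", 2.0), 1), (("c", 3.0), 2))))

val flattened: Seq[(String, String, Double, Int)] = data.flatMap {
  case (k, vs) => vs.map { case ((v1, v2), v3) => (k, v1, v2, v3) }
}
// flattened == Seq(("a", "b", 2.0, 1), ("a", "c", 3.0, 2))

// With the Spark Cassandra connector on the classpath, the flattened RDD
// can then be written out; keyspace/table/column names here are placeholders:
// import com.datastax.spark.connector._
// flattenedRdd.saveToCassandra("my_keyspace", "my_table",
//   SomeColumns("pk", "name", "score", "count"))
```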

Upvotes: 1
