dks551

Reputation: 1113

How to interact with different Cassandra clusters from the same Spark context

I want to migrate data from my old Cassandra cluster to a new cluster and am thinking of writing some Spark jobs to do that. Is there any way to interact with multiple Cassandra clusters from the same SparkContext, so that I can read the data from one cluster and write it to the other using the saveToCassandra function inside the same Spark job?

val products = sc.cassandraTable("first_cluster","products").cache()
products.saveToCassandra("diff_cluster","products2")

Can we save the data into a different cluster?

Upvotes: 1

Views: 963

Answers (1)

undefined_variable

Reputation: 6218

Example from the spark-cassandra-connector documentation:

import com.datastax.spark.connector._
import com.datastax.spark.connector.cql._

import org.apache.spark.SparkContext

def twoClusterExample(sc: SparkContext) = {
  val connectorToClusterOne = CassandraConnector(sc.getConf.set("spark.cassandra.connection.host", "127.0.0.1"))
  val connectorToClusterTwo = CassandraConnector(sc.getConf.set("spark.cassandra.connection.host", "127.0.0.2"))

  val rddFromClusterOne = {
    // Sets connectorToClusterOne as default connection for everything in this code block
    implicit val c = connectorToClusterOne
    sc.cassandraTable("ks","tab")
  }

  {
    //Sets connectorToClusterTwo as the default connection for everything in this code block
    implicit val c = connectorToClusterTwo
    rddFromClusterOne.saveToCassandra("ks","tab")
  }

}
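
For completeness, here is a minimal sketch of how that pattern could be applied to the products migration from the question. Note that the first argument to cassandraTable and saveToCassandra is a keyspace name, not a cluster name; the cluster is selected by whichever CassandraConnector is implicitly in scope. The hosts, the "store" keyspace, and the table names below are placeholders, so substitute your own.

import com.datastax.spark.connector._
import com.datastax.spark.connector.cql._
import org.apache.spark.{SparkConf, SparkContext}

object ProductsMigration {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("cassandra-migration"))

    // Connector pointed at the old cluster (host is a placeholder; use your own contact point)
    val oldCluster = CassandraConnector(sc.getConf.set("spark.cassandra.connection.host", "10.0.0.1"))
    // Connector pointed at the new cluster
    val newCluster = CassandraConnector(sc.getConf.set("spark.cassandra.connection.host", "10.0.0.2"))

    // Read from the old cluster: the implicit connector in this block picks the cluster
    val products = {
      implicit val c = oldCluster
      sc.cassandraTable("store", "products") // "store" keyspace is assumed; use your real keyspace
    }

    // Write to the new cluster: a different implicit connector in this block
    {
      implicit val c = newCluster
      products.saveToCassandra("store", "products2")
    }

    sc.stop()
  }
}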

Upvotes: 5
