fattah.safa

Reputation: 966

Accessing Cassandra nodes in Spark

I have two Cassandra nodes and I'm developing a Java-Spark application.

I have one Spark master and two slaves. I use the following code to connect to a single Cassandra node:

sparkConf.set("spark.cassandra.connection.host", "server");

How can I add additional Cassandra nodes?

Upvotes: 3

Views: 423

Answers (2)

Vidya

Reputation: 30310

The documentation is quite clear:

new SparkConf(true)
   .set("spark.cassandra.connection.host", "192.168.123.10")

And just below:

"Multiple hosts can be passed in using a comma separated list ("127.0.0.1,127.0.0.2"). These are the initial contact points only, all nodes in the local DC will be used upon connecting."

In other words, a single reachable node is enough as the initial contact point: once connected, the connector discovers the remaining nodes in the local data center on its own. Passing a comma-separated list simply gives the connector several initial contact points, which helps if the first node happens to be unreachable when the application starts.
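As a minimal sketch of that (the two addresses below are placeholders for your own nodes):

import org.apache.spark.SparkConf

// Both Cassandra nodes listed as initial contact points; the connector
// discovers every other node in the local DC after the first connection.
val conf = new SparkConf(true)
  .set("spark.cassandra.connection.host", "192.168.123.10,192.168.123.11")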

Upvotes: 2

Achilleus

Reputation: 1944

You could try this if you are using Scala; I could not find anything similar for Python, though.

import com.datastax.spark.connector._
import com.datastax.spark.connector.cql.CassandraConnector

// One connector per cluster, each pointed at its own contact node
val connectorToClusterOne = CassandraConnector(sc.getConf.set("spark.cassandra.connection.host", "127.0.0.1"))
val connectorToClusterTwo = CassandraConnector(sc.getConf.set("spark.cassandra.connection.host", "127.0.0.2"))

// Read from cluster one; the implicit connector only applies inside this block
val rddFromClusterOne = {
  implicit val c = connectorToClusterOne
  sc.cassandraTable("ks", "tab")
}

// Write what was read to cluster two
{
  implicit val c = connectorToClusterTwo
  rddFromClusterOne.saveToCassandra("ks", "tab")
}
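Each implicit val is wrapped in its own block so the two connector definitions don't clash in scope; that scoping is what lets a single job read from one cluster and write to the other.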

Good Luck!!

Upvotes: 2
