Reputation: 3975
I'm having some trouble trying to connect to a remote Cassandra instance using Apache Spark and Scala. In the past I successfully managed to connect to MongoDB in the same way.
This time I really don't understand why I'm getting the following error:
Failed to open native connection to Cassandra at {127.0.0.1}:9042
I guess it's a dependency and version problem, but I was not able to find anything related to this particular issue, either in the documentation or in other questions.
I connect to my server through an SSH tunnel using JSch, and that part works fine (a minimal sketch of the tunnel follows the factory class below). I can then successfully create a local SparkContext with SparkConnectionFactory.scala:
package connection

import org.apache.spark.{SparkConf, SparkContext}

class SparkConnectionFactory {

  var sparkContext: SparkContext = _

  def initSparkConnection(): Unit = {
    val configuration = new SparkConf(true)
      .setMaster("local[8]")
      .setAppName("my_test")
      // Cassandra is reached through the SSH tunnel on localhost
      .set("spark.cassandra.connection.host", "localhost")
      .set("spark.cassandra.input.consistency.level", "ONE")
      .set("spark.driver.allowMultipleContexts", "true")
    sparkContext = new SparkContext(configuration)
  }

  def getSparkInstance: SparkContext = sparkContext
}
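For completeness, the tunnel is set up with JSch along these lines (a minimal sketch; host, credentials, and ports here are placeholders, not my actual values):

import com.jcraft.jsch.JSch

// Open an SSH session to the server (placeholder host/credentials).
val jsch = new JSch()
val session = jsch.getSession("user", "my-server.example.com", 22)
session.setPassword("password")
session.setConfig("StrictHostKeyChecking", "no")
session.connect()
// Forward local port 9042 to Cassandra's 9042 on the server, so that
// spark.cassandra.connection.host can simply stay "localhost". If the
// local port were different, spark.cassandra.connection.port would
// have to be set to match it.
session.setPortForwardingL(9042, "127.0.0.1", 9042)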
I then call the factory from my Main.scala:
val sparkConnectionFactory = new SparkConnectionFactory
sparkConnectionFactory.initSparkConnection()
val sc: SparkContext = sparkConnectionFactory.getSparkInstance
But when I try to read all the rows of a Cassandra table using:

import com.datastax.spark.connector._ // provides sc.cassandraTable

val rdd = sc.cassandraTable("my_keyspace", "my_table")
rdd.foreach(println)
I get the error I wrote above.
On my server I installed Scala ~v2.11.6, Spark ~v2.1.1, and Spark SQL ~v2.1.1. The machine has 8 cores (hence local[8]) and the keyspace has a replication factor of 1. In my pom.xml I have:
. . .
<properties>
    <scala.version>2.11.6</scala.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>${scala.version}</version>
    </dependency>
    . . .
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11 -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.1.1</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector_2.11 -->
    <dependency>
        <groupId>com.datastax.spark</groupId>
        <artifactId>spark-cassandra-connector_2.11</artifactId>
        <version>2.0.3</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql_2.11 -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.1.1</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/commons-codec/commons-codec -->
    <dependency>
        <groupId>commons-codec</groupId>
        <artifactId>commons-codec</artifactId>
        <version>1.9</version>
    </dependency>
</dependencies>
Is my issue caused by conflicting versions? If so, how can I fix it? If not, any hint as to what's causing it?
Thanks in advance.
Upvotes: 2
Views: 823
Reputation: 191681
I'm forwarding port 9042 to 8988
Then that's the port you need to connect to. Note that SparkConf.set takes both the key and the value as strings:

.set("spark.cassandra.connection.port", "8988")
Upvotes: 1