AndreaM16

Reputation: 3975

Unable to connect to remote Cassandra via Spark + Scala

I'm having some trouble trying to connect to a remote Cassandra instance using Apache Spark and Scala. In the past I successfully managed to connect to MongoDB in the same way.

This time I really don't understand why I'm getting the following error:

Failed to open native connection to Cassandra at {127.0.0.1}:9042

I suspect it's a dependency/version problem, but I wasn't able to find anything related to this particular issue, either in the documentation or in other questions.

Essentially, I connect to my server through an SSH tunnel using JSch, and that works fine. Then I successfully connect to the local Apache Spark instance using SparkConnectionFactory.scala:

package connection

import org.apache.spark.{SparkConf, SparkContext}

class SparkConnectionFactory {

  var sparkContext : SparkContext = _

  def initSparkConnection = {
    val configuration = new SparkConf(true).setMaster("local[8]")
                        .setAppName("my_test")
                        .set("spark.cassandra.connection.host", "localhost")
                        .set("spark.cassandra.input.consistency.level","ONE")
                        .set("spark.driver.allowMultipleContexts", "true")
    val sc = new SparkContext(configuration)
    sparkContext = sc
  }

  def getSparkInstance : SparkContext = {
    sparkContext
  }

}

And I call it in my Main.scala:

val sparkConnectionFactory = new SparkConnectionFactory
sparkConnectionFactory.initSparkConnection
val sc : SparkContext = sparkConnectionFactory.getSparkInstance

But, when I try to select all the items inside a Cassandra table using:

val rdd = sc.cassandraTable("my_keyspace", "my_table")
rdd.foreach(println) 

I get the error I wrote above.
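For reference, the tunnel is set up roughly like this (a minimal JSch sketch; the user, host, and key path are placeholders, not my real values):

```scala
import com.jcraft.jsch.{JSch, Session}

// Placeholder credentials -- replace with your own server details.
val jsch = new JSch()
jsch.addIdentity("/path/to/private_key")
val session: Session = jsch.getSession("user", "my-server.example.com", 22)
session.setConfig("StrictHostKeyChecking", "no")
session.connect()
// Forward a local port to the remote Cassandra native transport port 9042.
session.setPortForwardingL(8988, "127.0.0.1", 9042)
```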

On my server I installed Scala ~v2.11.6, Spark ~v2.1.1, and SparkSQL ~v2.1.1. The machine has 8 cores, and the keyspace has a replication factor of 1. In my pom.xml I have:

. . .
<properties>
    <scala.version>2.11.6</scala.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>${scala.version}</version>
    </dependency>

    . . .

    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11 -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.1.1</version>
    </dependency>

    <!-- https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector_2.11 -->
    <dependency>
        <groupId>com.datastax.spark</groupId>
        <artifactId>spark-cassandra-connector_2.11</artifactId>
        <version>2.0.3</version>
    </dependency>

    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql_2.11 -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.1.1</version>
    </dependency>

    <!-- https://mvnrepository.com/artifact/commons-codec/commons-codec -->
    <dependency>
        <groupId>commons-codec</groupId>
        <artifactId>commons-codec</artifactId>
        <version>1.9</version>
    </dependency>

</dependencies>    

Is my issue caused by conflicting versions? If yes, how can I fix this? If not, any hint on what's causing it?

Thanks in advance.

Upvotes: 2

Views: 823

Answers (1)

OneCricketeer

Reputation: 191681

"I'm forwarding port 9042 to 8988"

Then that's the port you need to connect to:

.set("spark.cassandra.connection.port", "8988")

Note that the value is a string, since SparkConf.set expects String keys and values.
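Putting it together, the full configuration would look roughly like this (a sketch assuming the tunnel forwards local port 8988 to the remote 9042, as described above):

```scala
import org.apache.spark.SparkConf

// Point the connector at the local end of the SSH tunnel,
// not at the default 127.0.0.1:9042.
val configuration = new SparkConf(true)
  .setMaster("local[8]")
  .setAppName("my_test")
  .set("spark.cassandra.connection.host", "localhost")
  .set("spark.cassandra.connection.port", "8988") // locally forwarded port
  .set("spark.cassandra.input.consistency.level", "ONE")
```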

Upvotes: 1
