HackCode

Reputation: 1857

Error while connecting Spark and Cassandra

What I'm doing:

What steps have I followed:

Downloaded cassandra-driver-core-2.1.1.jar and spark-cassandra-connector_2.11-1.4.1.jar

Added the jar file paths to conf/spark-defaults.conf like so:

spark.driver.extraClassPath \
                            ~/path/to/spark-cassandra-connector_2.11-1.4.1.jar:\
                            ~/path/to/cassandra-driver-core-2.1.1.jar

How am I running the shell:

After running ./bin/cassandra, I run Spark like:

sudo ./bin/pyspark

and also tried with sudo ./bin/spark-shell

What query am I making:

sqlContext.read.format("org.apache.spark.sql.cassandra")\
               .options(table="users", keyspace="test")\
               .load()\
               .show()

The problem:

 java.lang.NoSuchMethodError:\
                    scala.Predef$.$conforms()Lscala/Predef$$less$colon$less;

But org.apache.spark.sql.cassandra is present in the spark-cassandra-connector jar that I downloaded.

Here is the full Log Trace

What have I tried:

Questions I've been thinking about-

  1. Are the versions of cassandra, spark and scala that I'm using compatible with each other?
  2. Am I using the correct version of the jar files?
  3. Did I compile spark in the wrong way?
  4. Am I missing something or doing something wrong?

I'm really new to Spark and Cassandra, so I really need some advice! I've been spending hours on this, and it's probably something trivial.

Upvotes: 2

Views: 938

Answers (1)

RussS

Reputation: 16576

A few notes:

First, you are building Spark for Scala 2.10 but using Spark Cassandra Connector libraries built for Scala 2.11. To build Spark for 2.11 you need to use the -Dscala-2.11 flag. This is most likely the main cause of your errors.
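For reference, rebuilding Spark for Scala 2.11 looks roughly like this (a sketch based on the Spark 1.4 build instructions; run from the Spark source root, and add whatever Hadoop/YARN profiles your environment needs):

```shell
# Switch the Maven build from Scala 2.10 to 2.11 (script shipped in the Spark source tree)
./dev/change-version-to-2.11.sh

# Rebuild Spark against Scala 2.11
mvn -Dscala-2.11 -DskipTests clean package
```

After this rebuild, the Scala version of your Spark install matches the `_2.11` connector artifacts.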

Next, to actually include the connector in your project, the core libs alone, without their dependencies, will not be enough. If you got past the first error you would most likely see other class-not-found errors from the missing deps.

This is why it's recommended to use the Spark Packages website and the --packages flag. This pulls in a "fat jar" which has all the required dependencies. See http://spark-packages.org/package/datastax/spark-cassandra-connector

For Spark 1.4.1 and pyspark this would be:

//Scala 2.10
$SPARK_HOME/bin/pyspark --packages datastax:spark-cassandra-connector:1.4.1-s_2.10
//Scala 2.11
$SPARK_HOME/bin/pyspark --packages datastax:spark-cassandra-connector:1.4.1-s_2.11

You should never have to manually download jars using the --packages method.

Do not use spark.driver.extraClassPath; it only adds the dependencies to the driver, so remote code on the executors will not be able to use them.
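If you really must point Spark at locally downloaded jars instead of using --packages, the --jars flag ships them to the executors as well as the driver (a sketch; the jar paths are your own, and note this still does not resolve transitive dependencies the way --packages does):

```shell
# --jars distributes the listed jars to the driver AND the executors;
# multiple jars are comma-separated
$SPARK_HOME/bin/spark-shell \
  --jars ~/path/to/spark-cassandra-connector_2.10-1.4.1.jar,~/path/to/cassandra-driver-core-2.1.1.jar
```

Even so, --packages remains the simpler and less error-prone option.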

Upvotes: 4
