Reputation: 1857
What I'm doing:
What steps have I followed:
sudo build/mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package
and sbt/sbt clean assembly
Downloaded into spark/lib: cassandra-driver-core-2.1.1.jar and spark-cassandra-connector_2.11-1.4.1.jar
Added the jar file paths to conf/spark-defaults.conf, like this:
spark.driver.extraClassPath \
~/path/to/spark-cassandra-connector_2.11-1.4.1.jar:\
~/path/to/cassandra-driver-core-2.1.1.jar
How am I running the shell:
After running ./bin/cassandra, I run Spark like this:
sudo ./bin/pyspark
and also tried with sudo ./bin/spark-shell
What query am I making:
sqlContext.read.format("org.apache.spark.sql.cassandra")\
.options(table="users", keyspace="test")\
.load()\
.show()
The problem:
java.lang.NoSuchMethodError: scala.Predef$.$conforms()Lscala/Predef$$less$colon$less;
But org.apache.spark.sql.cassandra
is present in the spark-cassandra-connector jar that I downloaded.
Here is the full Log Trace
What have I tried:
The --packages, --driver-class-path, and --jars options, adding the 2 jars with each.
Questions I've been thinking about:
I'm really new to Spark and Cassandra, so I really need some advice! I've been spending hours on this and it's probably something trivial.
Upvotes: 2
Views: 938
Reputation: 16576
A few notes
First, you are building Spark for Scala 2.10 but using Spark Cassandra Connector libraries built for Scala 2.11. To build Spark for Scala 2.11 you need to use the -Dscala-2.11
flag. This is most likely the main cause of your errors.
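For example, a sketch of the build command from your question with the Scala 2.11 flag added (assuming the same Hadoop profile; depending on your Spark version you may also need to run the dev/ script that switches the POMs to Scala 2.11 first):
build/mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Dscala-2.11 -DskipTests clean package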
Next, to actually include the connector in your project, just including the core libs without their dependencies will not be enough. If you got past the first error you would most likely see other class-not-found errors from the missing deps.
This is why it's recommended to use the Spark Packages website and the --packages
flag. This will include a "fat jar" which has all the required dependencies. See
http://spark-packages.org/package/datastax/spark-cassandra-connector
For Spark 1.4.1 and pyspark this would be:
# Scala 2.10
$SPARK_HOME/bin/pyspark --packages datastax:spark-cassandra-connector:1.4.1-s_2.10
# Scala 2.11
$SPARK_HOME/bin/pyspark --packages datastax:spark-cassandra-connector:1.4.1-s_2.11
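The same flag works with spark-shell, which you also tried; for example (assuming the Scala 2.10 build):
# Scala 2.10
$SPARK_HOME/bin/spark-shell --packages datastax:spark-cassandra-connector:1.4.1-s_2.10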
With the --packages method you should never have to manually download jars.
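Once the shell starts with --packages, the query from your question should work unchanged (assuming the test keyspace and users table exist):
sqlContext.read.format("org.apache.spark.sql.cassandra") \
    .options(table="users", keyspace="test") \
    .load() \
    .show()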
Do not use spark.driver.extraClassPath; it only adds the dependencies to the driver, so remote code on the executors will not be able to use them.
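If you ever do set class paths by hand, the executors need the jars too, not just the driver; a sketch of the matching setting in conf/spark-defaults.conf, reusing the placeholder paths from your question (the jars must exist at those paths on every worker node):
spark.executor.extraClassPath \
~/path/to/spark-cassandra-connector_2.11-1.4.1.jar:\
~/path/to/cassandra-driver-core-2.1.1.jar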
Upvotes: 4