Reputation: 311
I am a beginner learning to work with Spark and Cassandra. I am trying to connect to Cassandra using PySpark. I am running Cassandra 2.1 and Spark 1.3.
I have cloned this repo https://github.com/TargetHolding/pyspark-cassandra and followed its instructions to get it working with the Spark shell as well as with spark-submit.
This is the command I am using:
./bin/spark-submit --packages pyspark-cassandra:1.3 --conf spark.cassandra.connection.host=127.0.0.1:9042 cassandra_test.py
I run pyspark the same way, replacing spark-submit and leaving out the script at the end.
I am getting this error:
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: Provided Maven Coordinates must be in the form 'groupId:artifactId:version'. The coordinate provided is: pyspark-cassandra:1.3
I have searched for this error and gone through related questions, but I have not been able to get the connector working.
Any help will be greatly appreciated. Thanks in advance.
Upvotes: 2
Views: 890
Reputation: 6495
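I haven't tried it, but the Spark Packages page for this project is here: http://spark-packages.org/package/TargetHolding/pyspark-cassandra
It suggests:
$SPARK_HOME/bin/spark-shell --packages TargetHolding:pyspark-cassandra:0.1.5
Note the TargetHolding: prefix. The error asks for 'groupId:artifactId:version', and your coordinate pyspark-cassandra:1.3 has only two parts, with no group ID, so that might be it. Note also that the version on that page is 0.1.5, not 1.3.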
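If it helps, here is a rough Python sketch of the three-part rule the error message describes, so you can check a coordinate before passing it to --packages. The function name is_valid_coordinate is made up for illustration. This is not Spark's actual validation code, only an approximation of the 'groupId:artifactId:version' requirement quoted in the error.

```python
def is_valid_coordinate(coordinate):
    """Return True if the string has three non-empty, colon-separated parts.

    Hypothetical helper: it approximates the 'groupId:artifactId:version'
    form named in the error message and is not taken from Spark itself.
    """
    parts = coordinate.split(":")
    return len(parts) == 3 and all(part.strip() for part in parts)


if __name__ == "__main__":
    # The first coordinate is the one from the question, the second is the
    # one shown on the Spark Packages page.
    for candidate in ("pyspark-cassandra:1.3",
                      "TargetHolding:pyspark-cassandra:0.1.5"):
        status = "ok" if is_valid_coordinate(candidate) else "rejected"
        print("%s -> %s" % (candidate, status))
```

The coordinate from the question fails this check because splitting it on ":" gives two parts instead of three.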
Upvotes: 1