heathensoul

Reputation: 311

Connecting to cassandra using pyspark

I am a beginner learning to work with Spark and Cassandra. I am trying to connect to Cassandra from PySpark. I am running Cassandra 2.1 and Spark 1.3.

I have cloned the repo https://github.com/TargetHolding/pyspark-cassandra and followed its instructions to get it working with the Spark shell as well as with spark-submit.

This is the command I am using:

./bin/spark-submit --packages pyspark-cassandra:1.3 --conf spark.cassandra.connection.host=127.0.0.1:9042 cassandra_test.py

I run the same command with pyspark in place of spark-submit (without the script at the end).

I am getting this error:

Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: Provided Maven Coordinates must be in the form 'groupId:artifactId:version'. The coordinate provided is: pyspark-cassandra:1.3

I have searched for this error and gone through related questions, but I have not been able to get the connector working.

Any help will be greatly appreciated. Thanks in advance.

Upvotes: 2

Views: 890

Answers (1)

ashic

Reputation: 6495

Haven't tried it, but the Spark Packages page is here: http://spark-packages.org/package/TargetHolding/pyspark-cassandra

Seems to suggest:

$SPARK_HOME/bin/spark-shell --packages TargetHolding:pyspark-cassandra:0.1.5

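Note the TargetHolding: bit. The error says --packages expects 'groupId:artifactId:version', and your coordinate pyspark-cassandra:1.3 has only two parts, with no groupId. That might be it.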
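
To illustrate the format the error message asks for, here is a rough Python sketch that splits a coordinate into its three parts. It is not Spark's actual validation code; the function name split_maven_coordinate is hypothetical and only mimics the rule quoted in the exception.

```python
def split_maven_coordinate(coordinate):
    """Split 'groupId:artifactId:version' into its three parts.

    Hypothetical helper for illustration only; it mirrors the rule
    stated in the error message, not Spark's real implementation.
    """
    parts = coordinate.split(":")
    # Exactly three non-empty parts are required.
    if len(parts) != 3 or any(not part.strip() for part in parts):
        raise ValueError(
            "Provided Maven Coordinates must be in the form "
            "'groupId:artifactId:version'. The coordinate provided is: "
            + coordinate
        )
    return tuple(parts)


# The coordinate from the Spark Packages page has all three parts.
print(split_maven_coordinate("TargetHolding:pyspark-cassandra:0.1.5"))

# The coordinate from the question has only two parts, so it is rejected.
try:
    split_maven_coordinate("pyspark-cassandra:1.3")
except ValueError as err:
    print(err)
```

Applied to the command in the question, that would mean passing --packages TargetHolding:pyspark-cassandra:0.1.5 to spark-submit. Whether 0.1.5 is the right release for Spark 1.3 is something to check on the package page.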

Upvotes: 1
