Reputation: 184
I am trying to run a Spark job that loads data into a Cassandra table, but it fails with the following error:
java.io.InvalidClassException: com.datastax.spark.connector.rdd.CassandraRDD; local class incompatible: stream classdesc serialVersionUID = 8438672869923844256, local class serialVersionUID = -8421059671969407806
I have three Spark nodes, and I used the following script to run the job:
spark-submit --class test.testchannels --master spark://ubuntu:7077 --deploy-mode client --jars /home/user/BigData/jars/spark-cassandra-connector_2.11-2.0.0-RC1.jar,/home/user/BigData/jars/spark-cassandra-connector-java_2.10-1.6.0-M1.jar /home/user/BigData/SparkJobs/testchannelsparksubmit.jar /home/user/Data/channel_30Jun2017.csv
I have also copied the Cassandra-related jars to the same path on the worker nodes.
Upvotes: 1
Views: 247
Reputation: 16576
You have mixed versions of the SCC (Spark Cassandra Connector) in your cluster. The jars on the driver have one definition of CassandraRDD and the remote ones have a different version. It is highly recommended that you do not copy jars into your Spark worker directories, as it is very easy to make this sort of mistake. It's much simpler to use the --packages option and let Spark distribute the resources for you.
/home/user/BigData/jars/spark-cassandra-connector_2.11-2.0.0-RC1.jar,/home/user/BigData/jars/spark-cassandra-connector-java_2.10-1.6.0-M1.jar
This pair of jars is most likely the culprit: not only are you combining two different versions of the connector, they also target two different versions of Spark (and are built for different Scala versions, 2.11 vs 2.10). After 1.6.0, all of the "java" modules were merged into the core module, so there is no need for the -java artifact. In addition, RC1 is not a released version of the connector (it is Release Candidate 1); you should be using 2.0.2, which is the latest release as of this post.
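For example, something along these lines should work (the class name, master URL, and file paths are copied from your question; verify that the _2.11 Scala suffix matches the Scala version your Spark build and application jar were compiled against):

spark-submit --class test.testchannels --master spark://ubuntu:7077 --deploy-mode client --packages com.datastax.spark:spark-cassandra-connector_2.11:2.0.2 /home/user/BigData/SparkJobs/testchannelsparksubmit.jar /home/user/Data/channel_30Jun2017.csv

With --packages, Spark resolves the connector and its dependencies from Maven Central and ships the same jars to the driver and every executor, so the local and remote class definitions can never diverge.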
Upvotes: 2