shubham kumar

Reputation: 13

Error in Spark cassandra integration with spark-cassandra connector

I am trying to save data to Cassandra from Spark in standalone mode by running the following command:

  bin/spark-submit --packages datastax:spark-cassandra-connector:1.6.0-s_2.10
     --class "pl.japila.spark.SparkMeApp" --master local  /home/hduser2/code14/target/scala-2.10/simple-project_2.10-1.0.jar

My build.sbt file is :-

name := "Simple Project"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.6.0"
resolvers += "Spark Packages Repo" at "https://dl.bintray.com/spark-packages/maven"
libraryDependencies += "datastax" % "spark-cassandra-connector" % "1.6.0-s_2.10"
libraryDependencies ++= Seq(
  "org.apache.cassandra" % "cassandra-thrift" % "3.5",
  "org.apache.cassandra" % "cassandra-clientutil" % "3.5",
  "com.datastax.cassandra" % "cassandra-driver-core" % "3.0.0"
)

My Spark code is :-

package pl.japila.spark

import org.apache.spark.{SparkContext, SparkConf}
import org.apache.spark.sql._
import com.datastax.spark.connector._
import com.datastax.spark.connector.cql._
import com.datastax.spark.connector.rdd._
import com.datastax.driver.core._
import com.datastax.driver.core.QueryOptions._

object SparkMeApp {
  def main(args: Array[String]) {
    val conf = new SparkConf(true).set("spark.cassandra.connection.host", "127.0.0.1")
    val sc = new SparkContext("local", "test", conf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    val rdd = sc.cassandraTable("test", "kv")
    val collection = sc.parallelize(Seq(("cat", 30), ("fox", 40)))

    collection.saveToCassandra("test", "kv", SomeColumns("key", "value"))
  }
}

And I got this error:-

Exception in thread "main" java.lang.NoSuchMethodError: com.datastax.driver.core.QueryOptions.setRefreshNodeIntervalMillis(I)Lcom/datastax/driver/core/QueryOptions;
        at com.datastax.spark.connector.cql.DefaultConnectionFactory$.clusterBuilder(CassandraConnectionFactory.scala:49)
        at com.datastax.spark.connector.cql.DefaultConnectionFactory$.createCluster(CassandraConnectionFactory.scala:92)
        at com.datastax.spark.connector.cql.CassandraConnector$.com$datastax$spark$connector$cql$CassandraConnector$$createSession(CassandraConnector.scala:153)
        at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$3.apply(CassandraConnector.scala:148)
        at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$3.apply(CassandraConnector.scala:148)
        at com.datastax.spark.connector.cql.RefCountedCache.createNewValueAndKeys(RefCountedCache.scala:31)
        at com.datastax.spark.connector.cql.RefCountedCache.acquire(RefCountedCache.scala:56)
        at com.datastax.spark.connector.cql.CassandraConnector.openSession(CassandraConnector.scala:81)
        at com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:109)

Versions used:

Spark - 1.6.0
Scala - 2.10.4
cassandra-driver-core jar - 3.0.0
Cassandra - 2.2.7
spark-cassandra-connector - 1.6.0-s_2.10

Can somebody please help?

Upvotes: 1

Views: 1469

Answers (1)

RussS

Reputation: 16576

I would start by removing

libraryDependencies ++= Seq(        
  "org.apache.cassandra" % "cassandra-thrift" % "3.5",
  "org.apache.cassandra" % "cassandra-clientutil" % "3.5",
  "com.datastax.cassandra" % "cassandra-driver-core" % "3.0.0"
)

since the libraries the connector depends on are pulled in automatically by the `--packages` dependency. Pinning them manually in build.sbt can put incompatible versions on the classpath, which is what a `NoSuchMethodError` at runtime usually indicates.
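With that Seq removed, the build.sbt from the question would reduce to roughly the following (a sketch using the versions the question already declares; the connector's transitive dependencies are left to `--packages` to resolve):

```scala
name := "Simple Project"

version := "1.0"

scalaVersion := "2.10.4"

// Spark itself; "provided" scope is common when running via spark-submit,
// but plain compile scope matches the original build file.
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.6.0"

// Resolver and connector coordinates, as in the question.
// Do NOT redeclare cassandra-driver-core, cassandra-thrift, or
// cassandra-clientutil here -- the connector brings compatible versions.
resolvers += "Spark Packages Repo" at "https://dl.bintray.com/spark-packages/maven"
libraryDependencies += "datastax" % "spark-cassandra-connector" % "1.6.0-s_2.10"
```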

Then I would test that package resolution works by launching the spark-shell with

./bin/spark-shell --packages datastax:spark-cassandra-connector:1.6.0-s_2.10

You should see the following resolutions happening correctly:

datastax#spark-cassandra-connector added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
        confs: [default]
        found datastax#spark-cassandra-connector;1.6.0-s_2.10 in spark-packages
        found org.apache.cassandra#cassandra-clientutil;3.0.2 in list
        found com.datastax.cassandra#cassandra-driver-core;3.0.0 in list
        ...
        [2.10.5] org.scala-lang#scala-reflect;2.10.5
:: resolution report :: resolve 627ms :: artifacts dl 10ms
        :: modules in use:
        com.datastax.cassandra#cassandra-driver-core;3.0.0 from list in [default]
        com.google.guava#guava;16.0.1 from list in [default]
        com.twitter#jsr166e;1.1.0 from list in [default]
        datastax#spark-cassandra-connector;1.6.0-s_2.10 from spark-packages in [default]
        ...

If these appear to resolve correctly but everything still doesn't work, I would try clearing out the cache for these artifacts.
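Clearing the cache could look like the following (a sketch assuming the default Ivy cache location, `~/.ivy2`, that `spark-submit --packages` uses; adjust the paths if your environment differs):

```shell
# Remove the cached connector and driver artifacts so the next
# --packages invocation re-downloads fresh, consistent copies.
rm -rf ~/.ivy2/cache/datastax
rm -rf ~/.ivy2/cache/com.datastax.cassandra
```

After this, rerun the spark-shell command above and check the resolution report again.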

Upvotes: 2
