Reputation: 1175
I'm using Spark (Spark version 1.2.1, Scala version 2.10.4) with Cassandra (Cassandra connector 1.2.0-rc3), and I want to use the joinWithCassandraTable function. I tried it in spark-shell and it works perfectly:
val customersInteractions = customers.joinWithCassandraTable(cassandraKeyspace, table).on(SomeColumns("c1", "c2")).select("cl1", "cl2", "cl3", "cl4", "cl5")
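(Outside the shell there is no implicit SparkContext, so the same call needs an explicit SparkConf/SparkContext. A minimal standalone sketch, where the keyspace, table, column names, and connection host are all placeholders, not my real values:)

import com.datastax.spark.connector._
import org.apache.spark.{SparkConf, SparkContext}

object JoinExample {
  def main(args: Array[String]): Unit = {
    // spark.cassandra.connection.host must point at a reachable Cassandra node
    val conf = new SparkConf()
      .setAppName("join-with-cassandra-table")
      .setMaster("local[*]")
      .set("spark.cassandra.connection.host", "127.0.0.1")
    val sc = new SparkContext(conf)

    // placeholder RDD of join keys; each tuple element maps onto a column in SomeColumns
    val customers = sc.parallelize(Seq(("a", "b"), ("c", "d")))

    val customersInteractions = customers
      .joinWithCassandraTable("my_keyspace", "my_table")
      .on(SomeColumns("c1", "c2"))
      .select("cl1", "cl2", "cl3", "cl4", "cl5")

    customersInteractions.collect().foreach(println)
    sc.stop()
  }
}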
But now I want to use it in a Maven project. So, in IntelliJ I used these Maven dependencies:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<artifactId>aid-cim</artifactId>
<groupId>fr.aid.cim</groupId>
<version>0.9-SNAPSHOT</version>
</parent>
<artifactId>spark-cassandra</artifactId>
<properties>
<spark.version>1.2.1</spark.version>
<scala.version>2.11.0</scala.version>
</properties>
<repositories>
<repository>
<id>spark-jobserver</id>
<name>spark-jobserver</name>
<url>https://dl.bintray.com/spark-jobserver/maven</url>
</repository>
</repositories>
<pluginRepositories>
<pluginRepository>
<id>scala-tools.org</id>
<name>Scala-tools Maven2 Repository</name>
<url>http://scala-tools.org/repo-releases</url>
</pluginRepository>
</pluginRepositories>
<build>
<plugins>
<plugin>
<groupId>org.scala-tools</groupId>
<artifactId>maven-scala-plugin</artifactId>
<executions>
<execution>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<configuration>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
</configuration>
<executions>
<execution>
<id>make-assembly</id>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-dependency-plugin</artifactId>
</plugin>
</plugins>
</build>
<dependencies>
<dependency>
<groupId>net.sf.jopt-simple</groupId>
<artifactId>jopt-simple</artifactId>
<version>4.8</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
</dependency>
<!-- Scala -->
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-compiler</artifactId>
<version>${scala.version}</version>
</dependency>
<!-- END Scala -->
<dependency>
<groupId>spark.jobserver</groupId>
<artifactId>job-server-api</artifactId>
<version>0.4.1</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<version>4.3.6</version>
</dependency>
<dependency>
<groupId>com.googlecode.json-simple</groupId>
<artifactId>json-simple</artifactId>
<version>1.1</version>
</dependency>
<dependency>
<groupId>joda-time</groupId>
<artifactId>joda-time</artifactId>
<version>2.6</version>
</dependency>
<!-- START Logger -->
<dependency>
<groupId>ch.qos.logback</groupId>
<artifactId>logback-classic</artifactId>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
</dependency>
<!-- END Logger -->
<!-- Tests -->
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.hamcrest</groupId>
<artifactId>hamcrest-core</artifactId>
</dependency>
<dependency>
<groupId>info.cukes</groupId>
<artifactId>cucumber-picocontainer</artifactId>
<version>1.2.0</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>info.cukes</groupId>
<artifactId>cucumber-core</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>info.cukes</groupId>
<artifactId>cucumber-junit</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.cassandraunit</groupId>
<artifactId>cassandra-unit</artifactId>
<version>2.1.3.1</version>
<exclusions>
<exclusion>
<artifactId>slf4j-log4j12</artifactId>
<groupId>org.slf4j</groupId>
</exclusion>
</exclusions>
<scope>test</scope>
</dependency>
<!-- END Tests -->
</dependencies>
<profiles>
<profile>
<id>local</id>
<activation>
<activeByDefault>true</activeByDefault>
</activation>
<dependencies>
<!-- Apache Spark -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>${spark.version}</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.10</artifactId>
<version>${spark.version}</version>
<scope>compile</scope>
</dependency>
<!-- END Apache Spark -->
<!-- START Spark Cassandra Connector -->
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.10</artifactId>
<version>1.2.0-rc3</version>
<!-- <version>${spark.version}</version> -->
<exclusions>
<exclusion>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.10</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector-java_2.10</artifactId>
<version>1.2.0-rc3</version>
<!-- <version>${spark.version}</version> -->
<exclusions>
<exclusion>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.10</artifactId>
</exclusion>
</exclusions>
</dependency>
</dependencies>
</profile>
<profile>
<id>cluster</id>
<dependencies>
<!-- Apache Spark -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>${spark.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.10</artifactId>
<version>${spark.version}</version>
<scope>provided</scope>
</dependency>
<!-- END Apache Spark -->
<!-- START Spark Cassandra Connector -->
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.10</artifactId>
<!-- <version>${spark.version}</version> -->
<version>1.2.0-rc3</version>
<exclusions>
<exclusion>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.10</artifactId>
</exclusion>
</exclusions>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector-java_2.10</artifactId>
<!-- <version>${spark.version}</version> -->
<version>1.2.0-rc3</version>
<exclusions>
<exclusion>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.10</artifactId>
</exclusion>
</exclusions>
<scope>provided</scope>
</dependency>
</dependencies>
</profile>
</profiles>
</project>
But when I try to execute my program, I get this error:
Exception in thread "main" java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaUniverse$JavaMirror;
The line of my code where the error occurs is the joinWithCassandraTable call.
Did I get some Maven dependencies wrong? How can I fix this problem? Thanks in advance for your help.
Upvotes: 2
Views: 2660
Reputation: 2033
Spark 1.2.1 depends on Scala 2.10.4. You can check the dependency versions in the Maven repository: https://mvnrepository.com/artifact/org.apache.spark
You have to change
<properties>
<spark.version>1.2.1</spark.version>
<scala.version>2.11.0</scala.version>
</properties>
to
<properties>
<spark.version>1.2.1</spark.version>
<scala.version>2.10.4</scala.version>
</properties>
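To make the mismatch harder to reintroduce, you can also factor out the Scala binary version into its own property and reuse it in every _2.10 artifactId (a sketch; the scala.binary.version property name is only a convention, not something Maven requires):
<properties>
<spark.version>1.2.1</spark.version>
<scala.version>2.10.4</scala.version>
<scala.binary.version>2.10</scala.binary.version>
</properties>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_${scala.binary.version}</artifactId>
<version>${spark.version}</version>
</dependency>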
Upvotes: 3
Reputation: 5999
For submitting to a cluster, having the dependencies in Maven is not enough. You usually have to assemble a fat JAR that includes the dependencies and pass that to Spark, so that it can make all of the code available on the executors.
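With the maven-assembly-plugin configuration already in your pom, the build-and-submit steps look roughly like this (a sketch: the main class name, master URL, and exact JAR file name are placeholders that depend on your project):

# activate the "cluster" profile so Spark is scoped as provided,
# and build the jar-with-dependencies artifact under target/
mvn clean package -P cluster

# hand the fat JAR to spark-submit
spark-submit --class fr.aid.cim.YourMainClass \
  --master spark://your-master:7077 \
  target/spark-cassandra-0.9-SNAPSHOT-jar-with-dependencies.jar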
Upvotes: 1