conker84

Reputation: 45

Apache Spark: NoSuchMethodError

I'm trying to run some tests on Apache Spark (v1.3.0). I have a simple Java 8 class:

package it.conker.spark.base;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class WordCount {
    private JavaSparkContext ctx;
    private String inputFile, outputFile;
    public WordCount(String inputFile, String outputFile) {
        this.inputFile = inputFile;
        this.outputFile = outputFile;
        // Initialize the Spark context in local mode
        ctx = new JavaSparkContext("local", "WordCount",
                System.getenv("SPARK_HOME"), System.getenv("JARS"));

    }

    public static void main(String... args) {
        String inputFile = "/home/workspace/spark/src/main/resources/inferno.txt";//args[0];
        String outputFile = "/home/workspace/spark/src/main/resources/dv";//args[1];
        WordCount wc = new WordCount(inputFile, outputFile);
        wc.doWordCount();
        wc.close();
    }

    public void doWordCount() {
        long start = System.currentTimeMillis();
        JavaRDD<String> inputRdd = ctx.textFile(inputFile);
        // Emit a (word, 1) pair for every word, then sum the counts per word
        JavaPairRDD<String, Integer> count = inputRdd.flatMapToPair((String s) -> {
            List<Tuple2<String, Integer>> list = new ArrayList<>();
            Arrays.asList(s.split(" ")).forEach(s1 -> list.add(new Tuple2<>(s1, 1)));
            return list;
        }).reduceByKey((x, y) -> x + y);
        Tuple2<String, Integer> max = count.max(new Tuple2Comparator());
        System.out.println(max);
//      count.saveAsTextFile(outputFile);
        long end = System.currentTimeMillis();
        System.out.println(String.format("Time in ms is: %d", end - start));
    }

    public void close() {
        ctx.stop();
    }

}

The Tuple2Comparator class is:

import java.io.Serializable;
import java.util.Comparator;

import scala.Tuple2;

public class Tuple2Comparator implements Comparator<Tuple2<String, Integer>>, Serializable {
    private static final long serialVersionUID = 103955884403243585L;

    @Override
    public int compare(Tuple2<String, Integer> o1, Tuple2<String, Integer> o2) {
        // Compare ascending by count so that max() returns the most frequent
        // word; Integer.compare also avoids overflow from subtraction.
        return Integer.compare(o1._2(), o2._2());
    }

}

When I run it I get the following exception:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.spark.api.java.JavaPairRDD.max(Ljava/util/Comparator;)Lscala/Tuple2;
        at it.conker.spark.base.WordCount.doWordCount2(WordCount.java:69)
        at it.conker.spark.base.WordCount.main(WordCount.java:41)

This is my pom file:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <groupId>it.conker.spark</groupId>
    <artifactId>learning-spark-by-example</artifactId>
    <modelVersion>4.0.0</modelVersion>
    <name>Learning Spark by example</name>
    <packaging>jar</packaging>
    <version>0.0.1</version>
    <dependencies>
        <dependency> <!-- Spark dependency -->
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.3.0</version>
            <scope>provided</scope>
        </dependency>

        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.11</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
    <properties>
        <java.version>1.8</java.version>
    </properties>
    <build>
        <pluginManagement>
            <plugins>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-compiler-plugin</artifactId>
                    <version>3.1</version>
                    <configuration>
                        <source>${java.version}</source>
                        <target>${java.version}</target>
                    </configuration>
                </plugin>
            </plugins>
        </pluginManagement>
    </build>
</project>

I run the class within Eclipse. Can someone tell me where I went wrong?

Upvotes: 3

Views: 1204

Answers (2)

Sandeep Purohit

Reputation: 3702

You should change the scope of the Spark dependency to compile scope; it should then work fine if you run it on your local machine.
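For example, a minimal sketch of that change against the question's pom (only the scope line differs; compile is Maven's default scope, so the element could also be omitted entirely):

<dependency> <!-- Spark dependency -->
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.3.0</version>
    <scope>compile</scope> <!-- was: provided -->
</dependency>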

Upvotes: 0

Josh Rosen

Reputation: 13841

I believe that this is an occurrence of SPARK-3266, which will be fixed in all upcoming Spark maintenance releases (see my pull request).

One workaround, which I have not tested, would be to cast count to a JavaRDDLike<Tuple2<String, Integer>, ?> before calling max(), since a similar workaround worked for someone else (see my comment on JIRA).
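A sketch of that workaround (untested, as noted) applied inside the question's doWordCount() might look like this; the variable name rddLike is purely illustrative:

import org.apache.spark.api.java.JavaRDDLike;

// Typing the receiver as the JavaRDDLike interface makes the compiled call
// site bind against JavaRDDLike.max, which does exist at runtime, instead of
// the JavaPairRDD.max signature that triggers the NoSuchMethodError.
JavaRDDLike<Tuple2<String, Integer>, ?> rddLike =
        (JavaRDDLike<Tuple2<String, Integer>, ?>) count;
Tuple2<String, Integer> max = rddLike.max(new Tuple2Comparator());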

Upvotes: 4
