Dinosaurius
Dinosaurius

Reputation: 8628

Caused by: java.lang.ClassNotFoundException: org.jets3t.service.ServiceException

My code should access some file stored on S3 (this code works fine on one machine, while it fails on the other one; basically it fails when it gets executed locally (not on the cluster) from Intellij IDEA):

sc.hadoopConfiguration.set("fs.s3n.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem")
sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", "xxx")
sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", "xxx")

val sqlContext = new SQLContext(sc)

var df = sqlContext.read.json("s3n://myPath/*.json")

I get the following error at the line var df = sqlContext.read.json("s3n://myPath/*.json"):

Caused by: java.lang.ClassNotFoundException: org.jets3t.service.ServiceException
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

I read similar threads regarding this issue and it was mentioned that in case of using Spark 1.6.2, the solution is to use org.apache.hadoop hadoop-aws 2.6.0. In my case it has not solved the problem.

My pom.xml (an extract from it):

<properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>

        <java.version>1.8</java.version>
        <scala.version>2.10.6</scala.version>
        <spark.version>1.6.2</spark.version>
        <jackson.version>2.8.3</jackson.version>
    </properties>

<dependencies>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_2.10</artifactId>
            <!--<scope>provided</scope>-->
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming-kafka_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.10</artifactId>
            <!--<scope>provided</scope>-->
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-mllib_2.10</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.module</groupId>
            <artifactId>jackson-module-scala_2.10</artifactId>
            <version>${jackson.version}</version>
        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
            <version>${jackson.version}</version>
        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-annotations</artifactId>
            <version>${jackson.version}</version>
        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-core</artifactId>
            <version>${jackson.version}</version>
        </dependency>
        <dependency>
            <groupId>com.lambdaworks</groupId>
            <artifactId>jacks_2.10</artifactId>
            <version>2.3.3</version>
        </dependency>
        <dependency>
            <groupId>com.typesafe</groupId>
            <artifactId>config</artifactId>
            <version>1.3.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-aws</artifactId>
            <version>2.6.0</version>
        </dependency>
        <dependency>
            <groupId>com.amazonaws</groupId>
            <artifactId>aws-java-sdk-s3</artifactId>
            <version>1.11.53</version>
        </dependency>
        <dependency>
            <groupId>net.debasishg</groupId>
            <artifactId>redisclient_2.10</artifactId>
            <version>3.3</version>
        </dependency>
    </dependencies>

Upvotes: 5

Views: 9295

Answers (1)

Ramesh Maharjan
Ramesh Maharjan

Reputation: 41957

Adding the following in the dependency should solve the issue

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.6.0</version>
</dependency>

I hope this helps

Upvotes: 6

Related Questions