Shivkumar Mallesappa
Shivkumar Mallesappa

Reputation: 3077

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/RecordReader

I am trying to convert my Json file to Parquet format.

Following is my pom file.

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.mypackage</groupId>
    <artifactId>JSONToParquet</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>jar</packaging>

    <repositories>
        <repository>
            <id>wso2</id>
            <url>http://dist.wso2.org/maven2/</url>
        </repository>
    </repositories>
    <dependencies>
        <dependency>
            <groupId>org.kitesdk</groupId>
            <artifactId>kite-data-core</artifactId>
            <version>1.1.0</version>
        </dependency>

        <dependency>
            <groupId>org.kitesdk</groupId>
            <artifactId>kite-morphlines-all</artifactId>
            <version>1.0.0</version> <!-- or whatever the latest version is -->
            <type>pom</type>
        </dependency>

        <!-- https://mvnrepository.com/artifact/ua_parser/ua-parser -->
        <dependency>
            <groupId>ua_parser</groupId>
            <artifactId>ua-parser</artifactId>
            <version>1.3.0</version>
            <type>pom</type>
        </dependency>

    </dependencies>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
    </properties>


</project>

Following is the code for conversion :

Schema jsonSchema = JsonUtil.inferSchema(inputstream, "Movie", 10);
try (JSONFileReader<Movie> reader = new JSONFileReader<>(
        inputstream, jsonSchema, Movie.class)) {

    reader.initialize();

    ParquetWriter parquetWriter
            = new AvroParquetWriter(outputPath, jsonSchema, compressionCodecName, ParquetWriter.DEFAULT_BLOCK_SIZE, ParquetWriter.DEFAULT_PAGE_SIZE);

    for (Movie record : reader) {
        parquetWriter.write(record);
    }

In the above code Movie is my POJO class.

When I run the program I am facing the following exception :

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/RecordReader
    at com.mypackage.jsontoparquet.JsonToParquet.main(JsonToParquet.java:34)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapreduce.RecordReader
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 1 more

I am using JDK : 8.

I don't have any background of hadoop, so I am unable to understand it's root cause.

What is the issue ?

Upvotes: 0

Views: 1814

Answers (2)

Ori Marko
Ori Marko

Reputation: 58832

Your kite is missing hadoop dependencies

there are some cases where you may have to provide the relevant Hadoop component dependencies yourself, and Kite has grouping dependencies for this purpose.

For Haddop2 (default) add to your pom:

 <dependency>
   <groupId>org.kitesdk</groupId>
   <artifactId>kite-hadoop2-dependencies</artifactId>
    <version>1.0.0</version>
   <type>pom</type>
   <scope>compile</scope>
 </dependency>

Upvotes: 0

Mobin Ranjbar
Mobin Ranjbar

Reputation: 1360

Based on Kite-SDK Documentation, JSONFileReader,ParquetWriter and AvroParquetWriter use Hadoop libraries to work. It is needed to add Hadoop dependencies in your pom. You need at least below dependencies. Add them in your pom.xml:

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-core</artifactId>
    <version>2.6.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.6.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-jobclient</artifactId>
    <version>2.6.0</version>
</dependency>

Upvotes: 2

Related Questions