eshant batra

Reputation: 1

How to add an external JAR to the Hadoop environment?

I am facing some issues while running a MapReduce job.
I used a JSON jar to process a JSON file stored in HDFS and wrote the logic, but while running the job I get an error (a ClassNotFoundException, i.e. the class cannot be found).
I don't know how Hadoop detects my jar or where it should be placed.
Where do I set the jar path, and in which file?
Can anyone solve my problem?

Upvotes: 0

Views: 2004

Answers (3)

Gyanendra Dwivedi

Reputation: 5538

Assuming that your project is a Maven project, just create a fat jar with all the dependent jars. See here:

https://www.mkyong.com/maven/create-a-fat-jar-file-maven-assembly-plugin/

The dependent jars will then be available on the classpath. Another option is to add the jar to the distributed cache in the driver class:

DistributedCache.addFileToClassPath(yourJar, conf);
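For a fuller picture, here is a minimal driver sketch, assuming the third-party jar has already been uploaded to HDFS (the path /libs/json.jar and the class name JsonJobDriver below are placeholders, not from the question). On Hadoop 2+ the same facility is exposed directly on the Job API, which avoids the deprecated DistributedCache class:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class JsonJobDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "json job");
        job.setJarByClass(JsonJobDriver.class);

        // Ship the third-party jar to every task's classpath.
        // The path must point to a jar that already exists in HDFS.
        job.addFileToClassPath(new Path("/libs/json.jar")); // hypothetical HDFS path

        // ... set mapper, reducer, input/output paths, etc. ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}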

Another option is to add the jar to the Hadoop classpath:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/path/to/3rd_party.jar

Upvotes: 0

mike

Reputation: 373

Set the HADOOP_CLASSPATH environment variable:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:<DEPENDENT_JARS_USED_BY_CLIENT_CLASS>

Use the -libjars option when submitting the job:

hadoop jar example.jar com.example.Tool -libjars mysql-connector-java.jar,abc.jar
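Note that -libjars is only honored if the driver lets Hadoop parse the generic options, typically by running through ToolRunner. A minimal sketch of such a driver, with a placeholder class name ExampleTool (mapper/reducer wiring omitted):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class ExampleTool extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        // getConf() already contains whatever -libjars / -files / -D options
        // the GenericOptionsParser extracted from the command line.
        Job job = Job.getInstance(getConf(), "example tool");
        job.setJarByClass(ExampleTool.class);
        // ... configure mapper, reducer, input/output paths ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new ExampleTool(), args));
    }
}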

Upvotes: 1

OneCricketeer

Reputation: 191728

I don't know how Hadoop detects my jar or where it should be placed

It reads from the classpath of the YARN containers.

The easiest way to get the library added is to "shade" (using Maven) whatever JSON library you need into your MapReduce program's JAR file.


An (arguably) easier way to process JSON would be to use Spark, Drill, or Hive.

Upvotes: 0
