Reputation: 1
I am facing some issues while running a MapReduce job.
I used a JSON jar to process a JSON file stored in HDFS and wrote the logic, but while running the job I get a ClassNotFoundException for the JSON library's classes.
I don't know how Hadoop detects my jar or where it should be placed.
Where do I set the jar path, and in which file?
Can anyone help me solve this problem?
Upvotes: 0
Views: 2004
Reputation: 5538
Assuming that your project is a Maven project, just create a fat jar that bundles all the dependent jars. See here:
https://www.mkyong.com/maven/create-a-fat-jar-file-maven-assembly-plugin/
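For reference, a minimal sketch of the maven-assembly-plugin configuration that article describes (the settings are illustrative; adjust them to your build):

<!-- Illustrative maven-assembly-plugin setup for a fat jar -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-assembly-plugin</artifactId>
  <configuration>
    <descriptorRefs>
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
  </configuration>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
    </execution>
  </executions>
</plugin>

Running mvn package then produces an additional *-jar-with-dependencies.jar that you submit with hadoop jar.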
The dependent jars will then be available on the classpath. Another option is to add the jar to the distributed cache in the driver class:
// yourJar is an org.apache.hadoop.fs.Path pointing to a jar already uploaded to HDFS
DistributedCache.addFileToClassPath(yourJar, conf);
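A minimal driver sketch of this approach (note that DistributedCache is deprecated in Hadoop 2, where Job.addFileToClassPath does the same thing; the class name and HDFS path below are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class JsonJobDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "json job");
        job.setJarByClass(JsonJobDriver.class);
        // Placeholder path: the JSON library jar must already exist in HDFS
        job.addFileToClassPath(new Path("/libs/json.jar"));
        // ... set mapper, reducer, input and output paths ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}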
Another option is to add the jar to the Hadoop classpath:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/path/to/3rd_party.jar
Upvotes: 0
Reputation: 373
Set the HADOOP_CLASSPATH environment variable:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:<DEPENDENT_JARS_USED_BY_CLIENT_CLASS>
Use the -libjars option when submitting the job:
hadoop jar example.jar com.example.Tool -libjars mysql-connector-java.jar,abc.jar
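Note that -libjars is handled by GenericOptionsParser, so it only takes effect when the driver runs through ToolRunner. A minimal sketch (class and job names here are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyTool extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        // getConf() already reflects generic options such as -libjars
        Job job = Job.getInstance(getConf(), "my job");
        job.setJarByClass(MyTool.class);
        // ... set mapper, reducer, input and output paths ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner strips the generic options before calling run()
        System.exit(ToolRunner.run(new Configuration(), new MyTool(), args));
    }
}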
Upvotes: 1
Reputation: 191728
"I don't know how Hadoop detects my jar or where it should be placed"
It reads from the classpath of the YARN containers.
The easiest way to get the library added is to "shade" (using Maven) whatever JSON library you need into your MapReduce program's JAR file.
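For example, a sketch of the maven-shade-plugin configuration (the version shown is illustrative; adjust it to your build):

<!-- Illustrative maven-shade-plugin setup; bundles dependencies into the job jar -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.2.4</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
    </execution>
  </executions>
</plugin>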
The (arguably) easier way to process JSON would be to use Spark, Drill, or Hive.
Upvotes: 0