Reputation: 477
My goal is to run a simple MapReduce job on a Cloudera cluster that reads from a dummy HBase table and writes its output to an HDFS file.
Some important notes:
- I've successfully run MapReduce jobs on this cluster before that took an HDFS file as input and wrote an HDFS file as output.
- I've already replaced the libraries used for compiling the project from the "pure" HBase jars to the Cloudera HBase jars.
- When I previously ran into this kind of issue, I simply copied a lib into the distributed cache (this worked for me with Google Guice):
JobConf conf = new JobConf(getConf(), ParseJobConfig.class);
DistributedCache.addCacheFile(new URI("/user/hduser/lib/3.0/guice-multibindings-3.0.jar"), conf);
but that doesn't work now, because the HBaseConfiguration class is used to create the configuration (i.e. it is needed before the configuration exists). See the sketch right after these notes.
- The Cloudera version is 5.3.1 and the Hadoop version is 2.5.0.
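For the task-side half of the jar problem, HBase's TableMapReduceUtil also offers an addDependencyJars helper that puts the HBase jars into the job's distributed cache without any hand-written paths. A minimal sketch, assuming it runs after the Job has been configured (the wrapper class and method names here are illustrative only); note it cannot fix the exception below, because main() needs the HBase jars on its own classpath just to call HBaseConfiguration.create():
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;

public class ShipHbaseJars {
    // Illustrative wrapper: addDependencyJars() copies the HBase jars (and
    // the jars of the job's configured mapper/key/value classes) into the
    // job's distributed cache so the map and reduce tasks can load them.
    // initTableMapperJob() already calls this by default.
    static void shipJars(Job job) throws java.io.IOException {
        TableMapReduceUtil.addDependencyJars(job);
    }
}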
This is my driver code:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class HbaseJobDriver {
    public static void main(String[] args) throws Exception {
        // This call is where the NoClassDefFoundError below is thrown.
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "ExampleSummaryToFile");
        job.setJarByClass(HbaseJobDriver.class);

        Scan scan = new Scan();
        scan.setCaching(500);        // larger scanner caching for MapReduce
        scan.setCacheBlocks(false);  // don't pollute the block cache

        TableMapReduceUtil.initTableMapperJob("Metrics", scan,
                HbaseJobMapper.class, Text.class, IntWritable.class, job);

        job.setReducerClass(HbaseJobReducer.class);
        job.setNumReduceTasks(1);
        FileOutputFormat.setOutputPath(job, new Path(args[0]));

        // Submit the job and wait for it to finish.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
I am not sure whether the mapper/reducer classes are needed to solve this issue, so I have not included them.
The exception that I am getting is: Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
Upvotes: 0
Views: 1240
Reputation: 477
My colleague and I have just solved it. In our case, we needed to update the ~/.bashrc file:
nano ~/.bashrc
HBASE_PATH=/opt/cloudera/parcels/CDH/jars
export HADOOP_CLASSPATH=${HBASE_PATH}/hbase-common-0.98.6-cdh5.3.1.jar:<ANY_OTHER_JARS_REQUIRED>
. ~/.bashrc
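A possible refinement, assuming the hbase launcher script is on the PATH (it is on CDH parcel installs): let HBase print its own classpath instead of hard-coding a versioned jar, so the line survives CDH upgrades.
# `hbase classpath` prints every jar the HBase client needs.
export HADOOP_CLASSPATH=$(hbase classpath):$HADOOP_CLASSPATH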
Upvotes: 1
Reputation: 2574
The error Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
is due to the HBase jar missing from the classpath.
If what @sravan said did not work, then try importing HBaseConfiguration in your driver code's import section, like this:
import org.apache.hadoop.hbase.HBaseConfiguration;
Upvotes: 0
Reputation: 1082
Try this.
export HADOOP_CLASSPATH="/usr/lib/hbase/hbase.jar:$HADOOP_CLASSPATH"
Add the above line to your /etc/hadoop/conf/hadoop-env.sh file, or set it from the command line before submitting the job.
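For the command-line route, a minimal usage sketch; the jar name and output path are placeholders for your own build:
# Set the classpath for this shell only, then submit the job.
export HADOOP_CLASSPATH="/usr/lib/hbase/hbase.jar:$HADOOP_CLASSPATH"
hadoop jar hbase-job.jar HbaseJobDriver /user/hduser/output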
Upvotes: 0