I am using CDH 5.3 and am trying to write a MapReduce program that scans an HBase table and does some processing. I have created a mapper that extends TableMapper, and the exception I am getting is:
File does not exist: hdfs://localhost:54310/usr/local/hadoop-2.5-cdh-3.0/share/hadoop/common/lib/protobuf-java-2.5.0.jar
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(
As you can see, it is looking for protobuf-java-2.5.0.jar under an HDFS path, but the file is actually on the local filesystem at /usr/local/hadoop-2.5-cdh-3.0/share/hadoop/common/lib/protobuf-java-2.5.0.jar (I verified this). This does not happen with ordinary MapReduce programs; the error occurs only when I use TableMapper.
My driver code is as follows:
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class AppDriver {
    public static void main(String[] args) throws Exception {
        Configuration hbaseConfig = HBaseConfiguration.create();
        hbaseConfig.set("hbase.zookeeper.quorum", PropertiesUtil.getZookeperHostName());
        hbaseConfig.set("", PropertiesUtil.getZookeperPortNum());
        Job job = Job.getInstance(hbaseConfig, "hbasemapreducejob");
        job.setJarByClass(AppDriver.class);
        // Create a scan
        Scan scan = new Scan();
        scan.setCaching(500); // 1 is the default in Scan, which will be bad for MapReduce jobs
        scan.setCacheBlocks(false); // don't set to true for MR jobs
        // scan.setStartRow(Bytes.toBytes(PropertiesUtil.getHbaseStartRowkey()));
        // scan.setStopRow(Bytes.toBytes(PropertiesUtil.getHbaseStopRowkey()));
        TableMapReduceUtil.initTableMapperJob(PropertiesUtil.getHbaseTableName(), scan, ESportMapper.class, Text.class, RecordStatusVO.class, job);
        job.setReducerClass(ESportReducer.class);
        // Write the results to a file in the output directory
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        boolean b = job.waitForCompletion(true);
        if (!b) {
            throw new IOException("error with job!");
        }
    }
}
I pass the properties file as args[0].
Some more background information:
I am using a standalone CDH 5.3 installation on my local machine with HBase 0.98.6. HBase is running on top of HDFS in pseudo-distributed mode.
My Gradle build file is as follows:
apply plugin: 'java'
apply plugin: 'eclipse'
apply plugin: 'application'
// Basic Properties
sourceCompatibility = 1.7
targetCompatibility = '1.7'
version = '3.0'
mainClassName ="com.ESport.mapreduce.App.AppDriver"
jar {
    manifest {
        attributes "Main-Class": "$mainClassName"
    }
    from {
        configurations.compile.collect { it.isDirectory() ? it : zipTree(it) }
    }
    zip64 true
}
repositories {
    maven { url "" }
    maven { url " repos/" }
}
dependencies {
    testCompile group: 'junit', name: 'junit', version: '4.+'
    compile group: 'commons-collections', name: 'commons-collections', version: '3.2'
    compile 'org.apache.storm:storm-core:0.9.4'
    compile 'org.apache.commons:commons-compress:1.5'
    compile 'org.elasticsearch:elasticsearch:1.7.1'
    exclude group: 'org.slf4j'
    compile('org.apache.hbase:hbase-client:0.98.6-cdh5.3.0') {
        exclude group: 'org.slf4j'
        exclude group: 'org.jruby'
        exclude group: 'jruby-complete'
        exclude group: 'org.codehaus.jackson'
    }
    compile 'org.apache.hbase:hbase-common:0.98.6-cdh5.3.0'
    compile 'org.apache.hbase:hbase-server:0.98.6-cdh5.3.0'
    compile 'org.apache.hbase:hbase-protocol:0.98.6-cdh5.3.0'
    exclude group: 'org.slf4j'
    exclude group: 'org.apache.hbase'
    exclude group: 'org.slf4j'
    exclude group: 'org.slf4j'
    compile 'org.perf4j:perf4j:0.9.16'
    compile 'com.fasterxml.jackson.core:jackson-core:2.5.3'
    compile 'com.fasterxml.jackson.core:jackson-databind:2.5.3'
    compile 'com.fasterxml.jackson.core:jackson-annotations:2.5.3'
    compile 'com.fasterxml.jackson.dataformat:jackson-dataformat-yaml:2.1.2'
}
I am using this command to run the jar:
hadoop jar ESportingMapReduce-3.0.jar /myoutput
Reputation: 21
If you are trying to set up HBase in pseudo-distributed mode, the most probable cause of this error is that the Hadoop home directory has been added to $PATH.
Removing the Hadoop home from $PATH should let you start HBase in pseudo-distributed mode.
Some people add the Hadoop home to $PATH in .bashrc by default.
If you have added it in .bashrc, remove it from there.
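As a rough sketch of that cleanup, assuming the Hadoop home is the /usr/local/hadoop-2.5-cdh-3.0 directory that appears in the question (adjust it to whatever your .bashrc actually adds), a small shell function can drop the Hadoop entries from the current session's $PATH. The names strip_from_path and HADOOP_DIR are made up for illustration:

```shell
# Assumed Hadoop home, taken from the path in the question; change it to match your setup.
HADOOP_DIR="/usr/local/hadoop-2.5-cdh-3.0"

# Print a PATH-style string ($1) with every entry equal to or under the
# directory $2 removed, keeping the remaining entries in their original order.
strip_from_path() {
  result=""
  old_ifs=$IFS
  IFS=':'
  for entry in $1; do
    case "$entry" in
      "$2"|"$2"/*) ;;                              # skip the Hadoop home and its subdirectories
      *) result="${result:+$result:}$entry" ;;     # keep everything else
    esac
  done
  IFS=$old_ifs
  printf '%s\n' "$result"
}

# Apply it to the current shell session only.
PATH="$(strip_from_path "$PATH" "$HADOOP_DIR")"
export PATH
```

This only changes the running shell. To make it permanent, delete or comment out the line in .bashrc that appends the Hadoop home to $PATH and open a new terminal.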
Upvotes: 2