Radek Tomšej

Reputation: 490

Import data from MapReduce to HBase (TableOutputFormat error)

I am trying to save data from a MapReduce job into HBase. We wrote a script that works well on an older version of Hadoop (CDH3u4). We have now upgraded to the newest version (CDH 5.0.2), and the script no longer works.

When I run the program on the new version, I get the following error:

Exception in thread "main" java.lang.RuntimeException: java.io.IOException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:211)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
        at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:455)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:343)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1295)
        at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1292)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1292)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1313)
        at com.nrholding.t0_mr.main.ELogHBaseImport.main(ELogHBaseImport.java:89)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:389)
        at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:366)
        at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:247)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:188)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:150)
        at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:206)
        ... 17 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:387)
        ... 22 more
Caused by: java.lang.NoClassDefFoundError: org/cloudera/htrace/Trace
        at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:195)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:479)
        at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
        at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:83)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.retrieveClusterId(HConnectionManager.java:801)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:633)
        ... 27 more
Caused by: java.lang.ClassNotFoundException: org.cloudera.htrace.Trace
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 33 more

It seems that the problem is in TableOutputFormat, so I looked into it.

Here is the code that causes the problem:

public static void main(String args[]) throws Exception {

    Configuration conf = new Configuration();
    conf.set("hbase.zookeeper.quorum", "zookeeper_server1,zookeeper_server2,zookeeper_server3");
    conf.set(TableOutputFormat.OUTPUT_TABLE, "wp_json");

    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    String input = otherArgs[0];
    Job job = Job.getInstance(conf, "ELogHBaseImport");
    // Input is just text files in HDFS
    FileInputFormat.addInputPath(job, new Path(input));
    job.setJarByClass(ELogHBaseImport.class);
    job.setMapperClass(Map.class);
    job.setNumReduceTasks(0);
    job.setOutputFormatClass(TableOutputFormat.class);
    job.waitForCompletion(true);
}

When I use NullOutputFormat, everything works, but nothing is written to HBase.

The part of TableOutputFormat that throws the error is here:

163   /**
164    * Returns the output committer.
165    *
166    * @param context  The current context.
167    * @return The committer.
168    * @throws IOException When creating the committer fails.
169    * @throws InterruptedException When the job is aborted.
170    * @see org.apache.hadoop.mapreduce.OutputFormat#getOutputCommitter(org.apache.hadoop.mapreduce.TaskAttemptContext)
171    */
172   @Override
173   public OutputCommitter getOutputCommitter(TaskAttemptContext context)
174   throws IOException, InterruptedException {
175     return new TableOutputCommitter();
176   }
177 
178   public Configuration getConf() {
179     return conf;
180   }
181 
182   @Override
183   public void setConf(Configuration otherConf) {
184     this.conf = HBaseConfiguration.create(otherConf);
185 
186     String tableName = this.conf.get(OUTPUT_TABLE);
187     if(tableName == null || tableName.length() <= 0) {
188       throw new IllegalArgumentException("Must specify table name");
189     }
190 
191     String address = this.conf.get(QUORUM_ADDRESS);
192     int zkClientPort = this.conf.getInt(QUORUM_PORT, 0);
193     String serverClass = this.conf.get(REGION_SERVER_CLASS);
194     String serverImpl = this.conf.get(REGION_SERVER_IMPL);
195 
196     try {
197       if (address != null) {
198         ZKUtil.applyClusterKeyToConf(this.conf, address);
199       }
200       if (serverClass != null) {
201         this.conf.set(HConstants.REGION_SERVER_IMPL, serverImpl);
202       }
203       if (zkClientPort != 0) {
204         this.conf.setInt(HConstants.ZOOKEEPER_CLIENT_PORT, zkClientPort);
205       }
206       this.table = new HTable(this.conf, tableName);
207       this.table.setAutoFlush(false, true);
208       LOG.info("Created table instance for "  + tableName);
209     } catch(IOException e) {
210       LOG.error(e);
211       throw new RuntimeException(e);
212     }
213   }

Upvotes: 0

Views: 3956

Answers (1)

Maddy RS

Reputation: 1031

The error is actually caused by this exception:

Caused by: java.lang.ClassNotFoundException: org.cloudera.htrace.Trace
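The trace in the question nests several exceptions, and the last "Caused by" entry is the one that matters. As a rough illustration of how to find it programmatically, here is a minimal sketch that walks `getCause()` to the innermost exception. The class name `RootCause` is made up for this example, and the nested exception is rebuilt by hand from the trace using plain JDK types rather than taken from a real HBase run.

```java
import java.io.IOException;
import java.lang.reflect.InvocationTargetException;

/** Hypothetical helper: finds the innermost cause of a wrapped exception. */
final class RootCause {
    private RootCause() {}

    /** Follows getCause() until an exception with no further cause is reached. */
    static Throwable of(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Rebuild the same nesting as the trace in the question, by hand.
        ClassNotFoundException missing =
                new ClassNotFoundException("org.cloudera.htrace.Trace");
        NoClassDefFoundError linkage =
                new NoClassDefFoundError("org/cloudera/htrace/Trace");
        linkage.initCause(missing);
        Throwable wrapped = new RuntimeException(
                new IOException(new InvocationTargetException(linkage)));

        Throwable root = RootCause.of(wrapped);
        // prints "java.lang.ClassNotFoundException: org.cloudera.htrace.Trace"
        System.out.println(root.getClass().getName() + ": " + root.getMessage());
    }
}
```

The outer RuntimeException, IOException and InvocationTargetException only wrap the failure; the innermost exception names the class the JVM could not find.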

You are probably missing a jar on the classpath. The class named above may be referenced indirectly by your code (in your trace it is reached through the HBase client's ZooKeeper classes). Try adding the jar that contains this class to the classpath.
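To confirm whether the class is visible before submitting the job, a small check like the one below may help. It is only a sketch: `ClasspathCheck` and its methods are hypothetical names, and it assumes you run it with the same classpath that the job client uses.

```java
import java.security.CodeSource;

/** Hypothetical diagnostic: reports whether a class is visible on the classpath. */
final class ClasspathCheck {
    private ClasspathCheck() {}

    /** Returns true if the class can be located by this class's loader. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns where the class was loaded from, or null if unknown or not found. */
    static String locationOf(String className) {
        try {
            Class<?> c = Class.forName(className, false,
                    ClasspathCheck.class.getClassLoader());
            CodeSource source = c.getProtectionDomain().getCodeSource();
            // JDK core classes usually have no code source.
            return (source == null || source.getLocation() == null)
                    ? null
                    : source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the stack trace.
        String name = args.length > 0 ? args[0] : "org.cloudera.htrace.Trace";
        System.out.println(name + " loadable: " + isLoadable(name));
        System.out.println("loaded from: " + locationOf(name));
    }
}
```

Note that the trace fails in thread "main" during job submission, so the jar has to be on the client-side classpath, not only on the task nodes. One common way to do that (assuming the `hbase` launcher script is installed on the submitting machine) is to set HADOOP_CLASSPATH from the output of `hbase classpath` before running `hadoop jar`. `TableMapReduceUtil.addDependencyJars(job)` ships HBase's dependency jars to the tasks, but on its own it does not fix a class that is missing when the client starts.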

Hope this helps!!!

Upvotes: 1
