Tiger Chan
Tiger Chan

Reputation: 21

pig + hbase + hadoop2 integration

has anyone had successful experience loading data to hbase-0.98.0 from pig-0.12.0 on hadoop-2.2.0 in an environment of hadoop-2.20+hbase-0.98.0+pig-0.12.0 combination without encountering this error:

ERROR 2998: Unhandled internal error.
org/apache/hadoop/hbase/filter/WritableByteArrayComparable

with a line of log trace:

java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/filter/WritableByteArra

I searched the web and found a handful of problems and solutions but all of them refer to pre-hadoop2 and base-0.94-x which were not applicable to my situation. I have a 5 node hadoop-2.2.0 cluster and a 3 node hbase-0.98.0 cluster and a client machine installed with hadoop-2.2.0, base-0.98.0, pig-0.12.0. Each of them functioned fine separately and I got hdfs, map reduce, region servers , pig all worked fine. To complete an "loading data to base from pig" example, i have the following export:

export PIG_CLASSPATH=$HADOOP_INSTALL/etc/hadoop:$HBASE_PREFIX/lib/*.jar
:$HBASE_PREFIX/lib/protobuf-java-2.5.0.jar:$HBASE_PREFIX/lib/zookeeper-3.4.5.jar

and when i tried to run : pig -x local -f loaddata.pig and boom, the following error:ERROR 2998: Unhandled internal error. org/apache/hadoop/hbase/filter/WritableByteArrayComparable (this should be the 100+ times i got it dying countless tries to figure out a working setting). the trace log shows:lava.lang.NoClassDefFoundError: org/apache/hadoop/hbase/filter/WritableByteArrayComparable the following is my pig script:

REGISTER /usr/local/hbase/lib/hbase-*.jar;
REGISTER /usr/local/hbase/lib/hadoop-*.jar;
REGISTER /usr/local/hbase/lib/protobuf-java-2.5.0.jar;
REGISTER /usr/local/hbase/lib/zookeeper-3.4.5.jar;
raw_data = LOAD '/home/hdadmin/200408hourly.txt' USING PigStorage(',');
weather_data = FOREACH raw_data GENERATE $1, $10;
ranked_data = RANK weather_data;
final_data = FILTER ranked_data BY $0 IS NOT NULL;
STORE final_data INTO 'hbase://weather' USING
org.apache.pig.backend.hadoop.hbase.HBaseStorage('info:date info:temp');

I have successfully created a base table 'weather'. Has anyone had successful experience and be generous to share with us?

Upvotes: 2

Views: 784

Answers (2)

David Kjerrumgaard
David Kjerrumgaard

Reputation: 1076

If you know which jar file contains the missing class, e.g. org/apache/hadoop/hbase/filter/WritableByteArray, then you can use the pig.additional.jars property when running the pig command to ensure that the jar file is available to all the mapper tasks.

pig -D pig.additional.jars=FullPathToJarFile.jar bulkload.pig

Example:

pig -D pig.additional.jars=/usr/lib/hbase/lib/hbase-protocol.jar bulkload.pig

Upvotes: 0

bridiver
bridiver

Reputation: 1714

ant clean jar-withouthadoop -Dhadoopversion=23 -Dhbaseversion=95

By default it builds against hbase 0.94. 94 and 95 are the only options.

Upvotes: 1

Related Questions