animal

Reputation: 1004

Moving data to HBASE using Pig

I am trying to load 851 records into HBase. I created the HBase table with the command below:

create 'customers', 'customers_data'

I load the data with a Pig script. My Pig script is:

STOCK_A = LOAD '/user/cloudera/xxx' USING PigStorage('|');
data = FILTER STOCK_A BY ( $0 matches '.*MH.*');
MH_DATA = FOREACH data GENERATE $1, $3, $4;
STORE MH_DATA into 'hbase://customers' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('customers_data:firstname, customers_data:lastname, customers_data:age');

The Pig script produces 851 records. My data looks like this:

    (aman,george,22)
    (aman,george,22)
    (aman,george,22)
     .
     .
     .
     .
     .
    851 

But when I try to put this data into HBase by running the script with the command below:

PIG_CLASSPATH=/usr/lib/hbase/hbase.jar:/usr/lib/zookeeper/zookeeper-3.4.5-cdh4.4.0.jar /usr/bin/pig /home/cloudera/remot/pighl7

the only data that ends up stored in HBase is:

ROW                                         COLUMN+CELL                                                                                                                 
 \xB5~\x5C&                                 column=customers_data:firstname, timestamp=1478700582076, value=george
 \xB5~\x5C&                                 column=customers_data:lastname, timestamp=1478700582076, value=22

I can't find my 851 records, and the third field (age) is missing as well. I don't know what I am doing wrong. Please help.

Upvotes: 0

Views: 90

Answers (2)

animal

Reputation: 1004

After a lot of research and trial and error, I solved the problem by changing the row key from the name to a timestamp. Because many records shared the same name as their row key, each new record simply overwrote the previous one instead of adding a new row.
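
For readers hitting the same issue, here is a minimal sketch of one way to give every record its own row key. HBaseStorage treats the first field of each tuple as the row key and maps the remaining fields to the listed columns, which is also why the third column went missing in the question. The sketch uses `RANK` rather than the timestamp mentioned above, and assumes Pig 0.11 or later; the aliases `ranked` and `rowkey` are illustrative names, not part of the original script.

```pig
-- Load the pipe-delimited input and keep the matching rows, as in the question.
STOCK_A = LOAD '/user/cloudera/xxx' USING PigStorage('|');
data = FILTER STOCK_A BY ($0 matches '.*MH.*');

-- RANK with no BY clause prepends a sequential number to every tuple
-- (assumption: Pig 0.11 or later). The original fields shift right by one.
ranked = RANK data;

-- The first field becomes the HBase row key; the next three fields
-- map, in order, to the three columns named in HBaseStorage.
MH_DATA = FOREACH ranked GENERATE
    (chararray)$0 AS rowkey,
    (chararray)$2 AS firstname,
    (chararray)$4 AS lastname,
    (chararray)$5 AS age;

STORE MH_DATA INTO 'hbase://customers' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('customers_data:firstname, customers_data:lastname, customers_data:age');
```

After the job finishes, running `count 'customers'` in the HBase shell should report one row per input record. Note that the rank values are only unique within a single run, so loading a second file into the same table this way would overwrite earlier rows.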

Upvotes: 1

Rijul

Reputation: 1445

I think you forgot to give aliases in the GENERATE statement (to be on the safe side, I have also cast your fields to chararray).

Also, give a name to your STORE relation at the end.

Try:

MH_DATA = FOREACH data GENERATE (chararray)$1 AS firstname , (chararray)$3 AS lastname, (chararray)$4 AS age;

STORE_IN_HBASE = STORE MH_DATA into 'hbase://customers' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('customers_data:firstname, customers_data:lastname, customers_data:age');

For more information, see the HBaseStorage documentation: https://pig.apache.org/docs/r0.14.0/api/org/apache/pig/backend/hadoop/hbase/HBaseStorage.html

Upvotes: 1
