dino

Reputation: 239

How to transfer a table from HBase to Hive?

How can I transfer an HBase table into Hive correctly?

What I tried before is described in this question: "How insert overwrite table in hive with diffrent where clauses?" I made one table to import all the data. The problem is that the data is still in rows and not in columns. So I made 3 tables for news, social and all, each with a specific where clause. After that I made 2 joins on those tables, which gives me the result table. So I had 6 tables in total, which does not perform well.

To sum up my problem: in HBase the column families are stored as rows, like this:

count   verpassen   news    1
count   verpassen   social  0
count   verpassen   all     1

What I want to achieve in Hive is a data structure like this:

name      news    social   all
verpassen 1       0        1

How am I supposed to do this?

Upvotes: 1

Views: 5383

Answers (1)

yoga

Reputation: 1959

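Below is the approach you can use.

Use the HBase storage handler to create the table in Hive. For an HBase table that already exists, declare it with CREATE EXTERNAL TABLE instead of CREATE TABLE.

Example script: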

CREATE TABLE hbase_table_1(key string, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,f1:val")
TBLPROPERTIES ("hbase.table.name" = "test");

I loaded the sample data you gave into a Hive external table.

select name,collect_set(concat_ws(',',type,val)) input from TESTTABLE group by name ;

I am grouping the data by name. The query returns one row per name, with the type and value pairs collected into an array such as ["all,1","social,0","news,1"].
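
To make the shape of the grouped result concrete, here is a small Python sketch that mimics the grouping step on the three sample rows from the question. It only imitates concat_ws and collect_set in plain Python, it is not what Hive runs internally. The name group_by_name is made up for illustration, and Hive does not guarantee the order of the elements that collect_set returns.

```python
from collections import OrderedDict

# The three sample rows from the question as (name, type, val).
rows = [
    ("verpassen", "news", "1"),
    ("verpassen", "social", "0"),
    ("verpassen", "all", "1"),
]


def group_by_name(rows):
    """Imitate: select name, collect_set(concat_ws(',', type, val)) ... group by name."""
    grouped = OrderedDict()
    for name, kind, val in rows:
        entry = ",".join((kind, val))   # plays the role of concat_ws(',', type, val)
        bucket = grouped.setdefault(name, [])
        if entry not in bucket:         # collect_set keeps each distinct value once
            bucket.append(entry)
    return grouped


for name, entries in group_by_name(rows).items():
    print(name, entries)
```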

Next, I wrote a custom mapper script that takes the input column as its input and emits the values as separate columns.

from (select '["all,1","social,0","news,1"]' input from TESTTABLE group by name) d MAP d.input Using 'python test.py' as all,social,news
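
The test.py script itself is not shown here, so the following is only a guess at what such a mapper could look like. It assumes each input line is a JSON-style array string like the literal in the query above, and that Hive reads the script's output as tab-separated columns and treats \N as NULL. The names COLUMNS and pivot_line are hypothetical.

```python
import json
import sys

# Output column order, matching "as all,social,news" in the MAP query.
COLUMNS = ("all", "social", "news")


def pivot_line(line):
    # Parse one line such as ["all,1","social,0","news,1"] into a list of "type,val" strings.
    pairs = json.loads(line.strip())
    values = {}
    for pair in pairs:
        kind, _, val = pair.partition(",")
        values[kind] = val
    # Emit one tab-separated row; a missing type becomes \N (assumed to be read as NULL).
    return "\t".join(values.get(col, "\\N") for col in COLUMNS)


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Hive streams one row per line to the script and reads rows back from stdout.
    for line in stdin:
        if line.strip():
            stdout.write(pivot_line(line) + "\n")


if __name__ == "__main__":
    main()
```

You would typically make the script available to the job with ADD FILE test.py before running the MAP query.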

Alternatively, you can use the output to insert into another table that has the columns name, all, social and news.

Hope this helps

Upvotes: 1
