Ram

Reputation: 337

Accessing HBase table data from Hive based on Time Stamp

I created an HBase table whose column family keeps up to 10 versions:

create 'tablename',{NAME => 'cf', VERSIONS => 10}

and inserted two rows (row1 and row2), overwriting cf:name on row2 several times:

put 'tablename','row1','cf:id','row1id'
put 'tablename','row1','cf:name','row1name'
put 'tablename','row2','cf:id','row2id'
put 'tablename','row2','cf:name','row2name'
put 'tablename','row2','cf:name','row2nameupdate'
put 'tablename','row2','cf:name','row2nameupdateagain'
put 'tablename','row2','cf:name','row2nameupdateonemoretime'

I then read the data back with a scan:

scan 'tablename',{RAW => true, VERSIONS => 10}

I can see every version of each cell.

Next, I created a Hive external table that points to this HBase table:

CREATE EXTERNAL TABLE hive_timestampupdate(key string, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:name")
TBLPROPERTIES ("hbase.table.name" = "tablename");

When I query hive_timestampupdate, I can see the data from the HBase table:

select * from hive_timestampupdate;

Now I want to query the data by timestamp. Is there a way to query from Hive based on the timestamp of the HBase cells?

Upvotes: 1

Views: 1168

Answers (1)

user2913094

Reputation: 947

Unfortunately, no. According to the Hive HBase Integration document:

"there is currently no way to access the HBase timestamp attribute, and queries always access data with the latest timestamp."

There are some JIRAs about timestamp-related functionality, but they don't really do what you are asking, and they haven't gotten a great reception :(
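
If reading through the HBase shell instead of Hive is an option, a timestamp-bounded read can be sketched roughly as follows. The epoch-millisecond values are placeholders, not values from the table in the question; substitute timestamps taken from the scan output.

```
# Versions of row2's cf:name written in [min, max); the upper bound is exclusive
get 'tablename', 'row2', {COLUMN => 'cf:name', TIMERANGE => [1400000000000, 1400000600000], VERSIONS => 10}

# Only the version written at exactly this (placeholder) timestamp
get 'tablename', 'row2', {COLUMN => 'cf:name', TIMESTAMP => 1400000000000}

# The same range filter applied across all rows
scan 'tablename', {COLUMNS => 'cf:name', TIMERANGE => [1400000000000, 1400000600000], VERSIONS => 10}
```

This bypasses Hive entirely, so it is a workaround on the HBase side rather than a way to filter by timestamp in a Hive query.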

Upvotes: 1
