Reputation: 5184
I have the following scenario in my HBase instance:
hbase(main):002:0> create 'test', 'cf'
0 row(s) in 1.4690 seconds
hbase(main):003:0> put 'test', 'row1', 'cf:a', 'value1'
0 row(s) in 0.1480 seconds
hbase(main):004:0> put 'test', 'row2', 'cf:b', 'value2'
0 row(s) in 0.0070 seconds
hbase(main):005:0> put 'test', 'row3', 'cf:c', 'value3'
0 row(s) in 0.0120 seconds
hbase(main):006:0> put 'test', 'row3', 'cf:c', 'value4'
0 row(s) in 0.0070 seconds
Note that the last two puts target the same row key, the same column family and the same column. If I understand HBase correctly, row3 + cf:c identifies a single cell that holds every timestamped version of the inserted value.
But a simple scan returns only the most recent value:
hbase(main):010:0> scan 'test'
ROW COLUMN+CELL
row1 column=cf:a, timestamp=1317945279379, value=value1
row2 column=cf:b, timestamp=1317945285731, value=value2
row3 column=cf:c, timestamp=1317945301466, value=value4
3 row(s) in 0.0250 seconds
How do I get all timestamped values for a cell, or how do I perform a time-range query?
Upvotes: 16
Views: 30197
Reputation: 587
To change the number of versions allowed in a column family use the following command:
alter 'test', NAME=>'cf', VERSIONS=>2
then add another entry:
put 'test', 'row1', 'cf:a2', 'value1e'
then see the different versions:
get 'test', 'row1', {COLUMN => 'cf:a2', VERSIONS => 2}
would return something like:
COLUMN CELL
cf:a2 timestamp=1457947804214, value=value1e
cf:a2 timestamp=1457947217039, value=value1d
2 row(s) in 0.0090 seconds
Here is a link for more details: https://learnhbase.wordpress.com/2013/03/02/hbase-shell-commands/.
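To make the effect of VERSIONS => 2 concrete, here is a minimal sketch of the retention rule in plain Python. It is a toy in-memory model, not the HBase client API: the class name VersionedCell and its methods are hypothetical, and it trims old versions eagerly, whereas HBase removes them lazily.

```python
# Toy in-memory model of version retention; NOT the HBase client API.
class VersionedCell:
    """Keeps at most max_versions (timestamp, value) pairs, newest first."""

    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        self.versions = []  # list of (timestamp, value), newest first

    def put(self, timestamp, value):
        # A put with an already-stored timestamp replaces that version.
        self.versions = [(ts, v) for ts, v in self.versions if ts != timestamp]
        self.versions.append((timestamp, value))
        self.versions.sort(key=lambda tv: tv[0], reverse=True)
        # Drop everything beyond the retention limit (done eagerly in this model).
        del self.versions[self.max_versions:]

    def get(self, versions=1):
        # Return up to `versions` entries, newest first.
        return self.versions[:versions]


cell = VersionedCell(max_versions=2)
cell.put(1457947217039, "value1d")
cell.put(1457947804214, "value1e")
print(cell.get(versions=2))
# prints [(1457947804214, 'value1e'), (1457947217039, 'value1d')]
```

With max_versions left at 1, the same two puts would leave only value1e readable, which matches what the scan in the question shows.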
Upvotes: 2
Reputation: 139
The row key 'row3' used for value4 in cf:c must be unique; otherwise the existing value gets overwritten:
hbase(main):052:0> scan 'mytable' , {COLUMN => 'cf1:1', VERSION => 3}
ROW COLUMN+CELL
1234 column=cf1:1, timestamp=1405796300388, value=hello
1 row(s) in 0.0160 seconds
hbase(main):053:0> put 'mytable', 1234, 'cf1:1', 'wow!'
0 row(s) in 0.1020 seconds
The value 'hello' in column 1 of cf1 is overwritten by the second put, which uses the same row key, 1234, and the value 'wow!':
hbase(main):054:0> scan 'mytable', {COLUMN => 'cf1:1', VERSION => 3}
ROW COLUMN+CELL
1234 column=cf1:1, timestamp=1405831703617, value=wow!
2 row(s) in 0.0310 seconds
The next put uses a new row key, 123, with the value 'hey' for column 1 of cf1, and the scan for the last 3 versions now shows both 'hey' and 'wow!'. Please note that the versions are displayed in descending order.
hbase(main):055:0> put 'mytable', 123, 'cf1:1', 'hey'
hbase(main):004:0> scan 'mytable', {COLUMN => 'cf1:1', VERSION => 3}
ROW COLUMN+CELL
123 column=cf1:1, timestamp=1405831295769, value=hey
1234 column=cf1:1, timestamp=1405831703617, value=wow!
Upvotes: 1
Reputation: 4128
In order to see versions of a column you need to give the version count.
scan 'test', {VERSIONS => 3}
will give you up to 3 versions of each column, if they are available. You can use it in get as well:
get 'test', 'row3', {COLUMN => 'cf:c', VERSIONS => 3}
To get the value at a specific time, you can use TIMESTAMP as well:
get 'test', 'row3', {COLUMN => 'cf:c', TIMESTAMP => 1317945301466}
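If you need the values between two timestamps, use TIMERANGE, for example scan 'test', {TIMERANGE => [1317945279379, 1317945301467], VERSIONS => 3}; the upper bound is exclusive. TimestampsFilter matches only an explicit list of timestamps, not a range.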
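The selection rule for a time-range read can be sketched in a few lines of plain Python. This is a toy model, not the HBase client API: the function versions_in_range is hypothetical, and the value3 timestamp is made up because the question never shows it. The range is treated as half-open, minimum inclusive and maximum exclusive.

```python
# Toy model of a time-range read; plain Python, NOT the HBase client API.
def versions_in_range(versions, min_ts, max_ts, limit=None):
    """Return (timestamp, value) pairs with min_ts <= timestamp < max_ts, newest first."""
    hits = sorted(
        ((ts, v) for ts, v in versions if min_ts <= ts < max_ts),
        key=lambda tv: tv[0],
        reverse=True,
    )
    return hits if limit is None else hits[:limit]


# The value4 timestamp comes from the question's scan output;
# the value3 timestamp is hypothetical.
cf_c = [(1317945301466, "value4"), (1317945295000, "value3")]
print(versions_in_range(cf_c, 1317945290000, 1317945301466))
# prints [(1317945295000, 'value3')]
```

Because the upper bound is exclusive, value4 is left out here; raising the maximum by one millisecond would include it.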
Upvotes: 30