Ali

Reputation: 1869

HBase: Keeping only the first version of each cell

How can I configure HBase to store only the first version of each cell? Suppose the following HTable:

row_key          cf1:c1           timestamp
----------------------------------------
1                  x                 t1

After putting ("1", "cf1:c1", t2) with ColumnDescriptor.DEFAULT_VERSIONS = 2, the table becomes:

row_key          cf1:c1           timestamp
----------------------------------------
1                  x                 t1
1                  x                 t2

where t2 > t1.

How can I change this so that the first version of a cell is the only version that can be stored and retrieved? In the example above, the only version would be the 't1' one. In other words, I want HBase to ignore inserts that duplicate an existing cell.

I know that setting VERSIONS to 1 for the HTable and putting with a timestamp of Long.MAX_VALUE - System.currentTimeMillis() would solve my problem, but I don't know whether it is the best solution. What are the concerns with changing the timestamp to Long.MAX_VALUE - System.currentTimeMillis()? Does it have any performance issues?

Upvotes: 3

Views: 741

Answers (2)

Lodewijk Bogaards

Reputation: 19987

There are two strategies that I can think of:

1. One version + inverted timestamp

Setting VERSIONS to 1 for the HTable and putting with a timestamp of Long.MAX_VALUE - System.currentTimeMillis() will generally work and does not have any major performance issues.

On write:

  • When multiple versions of the same cell are written to HBase, all of them are stored at write time, whenever they arrive (without any impact on performance). After compaction only the cell with the highest timestamp will survive.
  • The cell with the highest timestamp in this scheme is the one written by the client with the lowest value for System.currentTimeMillis(). Note that this might not actually be the machine that tried to write to the cell first, since the clocks of HBase clients might be out of sync.
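To make the ordering argument concrete, here is a minimal plain-Java sketch. It does not call HBase at all: the class name, the `invert` helper and the `survivingValue` model of a VERSIONS = 1 cell are all hypothetical names made up for illustration. It only shows that with inverted timestamps the earliest wall-clock write ends up with the highest timestamp, which is the version that is kept.

```java
import java.util.Map;
import java.util.TreeMap;

public class InvertedTimestampSketch {

    // Inverted timestamp: an earlier wall-clock time maps to a larger value.
    public static long invert(long wallClockMillis) {
        return Long.MAX_VALUE - wallClockMillis;
    }

    // Hypothetical model of one cell with VERSIONS = 1: of all versions
    // written, only the one with the highest timestamp is kept.
    // Assumes the map holds at least one version.
    public static String survivingValue(Map<Long, String> versionsByTimestamp) {
        TreeMap<Long, String> sorted = new TreeMap<>(versionsByTimestamp);
        return sorted.lastEntry().getValue();
    }

    public static void main(String[] args) {
        Map<Long, String> versions = new TreeMap<>();
        versions.put(invert(1_000L), "first");  // written at t1
        versions.put(invert(2_000L), "second"); // written at t2 > t1
        System.out.println(survivingValue(versions)); // prints "first"
    }
}
```

In a real client the inverted value would be passed as the timestamp of the Put; the wall-clock values here are fixed numbers so the outcome does not depend on the machine's clock.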

On read:

  • When multiple versions of the same cell are found, they are pruned at read time. This can happen at any time, since your writes can occur at any time, even after compaction. This has a very slight impact on performance.

2. checkAndPut

To get true ordering through atomicity, meaning only the first write to reach the region server will succeed, you can use the checkAndPut operation:

From the docs:

public boolean checkAndPut(byte[] row, byte[] family, byte[] qualifier, byte[] value, Put put) throws IOException

Atomically checks if a row/family/qualifier value matches the expected value. If it does, it adds the put. If the passed value is null, the check is for the lack of column (ie: non-existance)

So by setting value to null your Put will only succeed if the cell did not exist. If your Put succeeded then the return value will be true. This gives true atomicity, but at a write performance cost.
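The first-write-wins contract can be illustrated with a small plain-Java model. This is only a sketch of the semantics, not HBase client code: the class `FirstWriteWinsSketch`, the method `putIfCellAbsent` and the "row/family:qualifier" key format are hypothetical, and a `ConcurrentHashMap` stands in for the region server's atomic check.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class FirstWriteWinsSketch {

    // Stand-in for a table: cell key ("row/family:qualifier") -> value.
    private final ConcurrentMap<String, String> cells = new ConcurrentHashMap<>();

    // Writes only if the cell does not exist yet, atomically, and reports
    // whether this write was the one that got stored.
    public boolean putIfCellAbsent(String cellKey, String value) {
        return cells.putIfAbsent(cellKey, value) == null;
    }

    public String get(String cellKey) {
        return cells.get(cellKey);
    }

    public static void main(String[] args) {
        FirstWriteWinsSketch table = new FirstWriteWinsSketch();
        System.out.println(table.putIfCellAbsent("1/cf1:c1", "x")); // true
        System.out.println(table.putIfCellAbsent("1/cf1:c1", "y")); // false
        System.out.println(table.get("1/cf1:c1"));                  // x
    }
}
```

As with checkAndPut and a null expected value, the boolean result tells the caller whether its write was the first one; later writers get false and the stored value is left untouched.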

On write:

  • A row lock is set and a Get is issued internally before existence is checked. Once non-existence is confirmed the Put is issued. As you can imagine this has a pretty big performance impact for each write, since each write now also involves a read and a lock.
  • During compaction nothing needs to happen, because only one Put will ever make it to HBase: always the first Put to reach the region server.
  • It should be noted that there is no way to batch this kind of checkAndPut operation by using checkAndMutate, since each Put needs its own check. This means each Put needs to be a separate request, so you will be paying a latency cost as well when writing in batches.

On read:

  • Only one version will ever make it to HBase, so there is no impact here.

Picking between strategies:

If true ordering really matters, or you may need to read each row before or after you write to HBase anyway (for example to find out whether your write succeeded), you're better off with strategy 2. In all other cases I'd recommend strategy 1, since its write performance is much better. In that case just make sure your clients are properly time synced.

Upvotes: 3

Sergei Rodionov

Reputation: 4529

You can insert the Put with Long.MAX_VALUE - timestamp and configure the table to store only 1 version (max versions => 1). This way only the first (earliest) Put will be returned by the Scan, because all successive Puts will have a smaller timestamp value.

Upvotes: 0
