v1p3r

Reputation: 757

How to compose an HBase key with variable length constituents

Suppose my HBase table needs to be accessed by a key which is a combination of four different elements (K1:DateTime, K2:Int, K3:String, K4:Double). What is the best practice to construct the key for this? I am especially concerned about the variable length data types (string).

Currently I am prefixing the string with its byte length so that I can parse each element back out of the key bytes. I was thinking that having the length at the beginning would make for fast checks when the string lengths do not match. Are there any drawbacks to this approach? Could it somehow affect querying based on partial keys later? (I am fairly new to HBase, having tinkered with it for just a week.)

Honestly, I don't like strings being part of the keys, and I am trying to get the guys to use some kind of enumeration instead, but I am not sure I'll be able to convince them. Assuming I am stuck with strings being part of the key, what is the best approach to composing a key from these elements?
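
For illustration, the length-prefixed layout described in the question might look roughly like the sketch below. The field order, the 2-byte length prefix, the use of epoch milliseconds for K1, and the names CompositeKey, encode and decode are all assumptions made for the example, not something the question specifies. It uses only java.nio, so it runs without an HBase dependency.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class CompositeKey {
    // Assumed layout:
    // [K1: 8-byte epoch millis][K2: 4-byte int][len(K3): 2-byte short][K3: UTF-8 bytes][K4: 8-byte double]

    static byte[] encode(long epochMillis, int k2, String k3, double k4) {
        byte[] s = k3.getBytes(StandardCharsets.UTF_8);
        if (s.length > Short.MAX_VALUE) {
            throw new IllegalArgumentException("K3 does not fit a 2-byte length prefix");
        }
        // ByteBuffer writes big-endian by default, so non-negative longs and ints
        // keep their numeric order when compared byte by byte.
        ByteBuffer buf = ByteBuffer.allocate(8 + 4 + 2 + s.length + 8);
        buf.putLong(epochMillis).putInt(k2).putShort((short) s.length).put(s).putDouble(k4);
        return buf.array();
    }

    // Returns {Long K1, Integer K2, String K3, Double K4}.
    static Object[] decode(byte[] key) {
        ByteBuffer buf = ByteBuffer.wrap(key);
        long k1 = buf.getLong();
        int k2 = buf.getInt();
        byte[] s = new byte[buf.getShort()]; // the length prefix tells us where K3 ends
        buf.get(s);
        double k4 = buf.getDouble();
        return new Object[] { k1, k2, new String(s, StandardCharsets.UTF_8), k4 };
    }
}
```

Because HBase orders row keys byte by byte, the fixed-width K1 and K2 at the front of this layout can serve as a scan prefix. With a length prefix in front of K3, keys that share K1 and K2 sort by string length first and by string content second, which matters for any partial-key scan that reaches into K3.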

Upvotes: 0

Views: 242

Answers (1)

Arnab Bhowmik

Reputation: 28

If the string K3 is unavoidable, its size is not a practical concern: HBase's default limit is 10 MB per cell. Capture the DateTime with millisecond precision. Two comparators are available for this scenario: RegexStringComparator and SubstringComparator. Refer to their documentation for usage.

If the length of the string filters the data substantially, keep the length at the beginning of the key, with a separator that a regex can match between the elements, and use RegexStringComparator. Otherwise, put the string itself at the beginning and pass the required value to RegexStringComparator as a parameter.
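
As a rough sketch of the first option, the snippet below builds a delimited string key with the length of K3 in front and derives a regex that selects one K3 value. It checks the pattern with plain java.util.regex so that it runs without the HBase client. The "|" separator, the field order, and the names DelimitedKey, build and forK3 are assumptions made for the example.

```java
import java.util.regex.Pattern;

final class DelimitedKey {
    static final String SEP = "|"; // hypothetical separator; K3 must not contain it

    // Assumed layout: <length of K3>|<K3>|<K1 epoch millis>|<K2>|<K4>
    // Note: length() counts chars, which equals the byte length only for ASCII strings.
    static String build(String k3, long epochMillis, int k2, double k4) {
        return k3.length() + SEP + k3 + SEP + epochMillis + SEP + k2 + SEP + k4;
    }

    // Pattern that matches every key whose K3 equals the given value,
    // whatever the remaining fields are. Pattern.quote escapes regex
    // metacharacters that may occur inside K3.
    static Pattern forK3(String k3) {
        return Pattern.compile("^" + k3.length() + "\\|" + Pattern.quote(k3) + "\\|.*$");
    }
}
```

With the HBase client, a pattern string like this would typically be passed to a RegexStringComparator inside a RowFilter; the exact constructor differs between HBase versions, so check the API documentation for the one in use.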

N.B.: It would be easier to work out a solution if real data were provided.

Upvotes: 0
