As I'm designing a row key for my HBase table, I have two questions to ask (consider we have only two regions).
To elaborate the question:
If I am inserting row keys starting with axx, bxx, ..., zxx, does the HBase Master assign the range a-m to one region and n-z to another region?
In another case, if I'm inserting row keys starting only with axx and bxx, does it assign axx to region one and bxx to the other?
HBase does not split a region until that region fills up, that is, until it grows past its configured maximum size. So even if you set up an HBase cluster with 2 region servers, a new table starts with a single region by default and all data is written to that one region initially. When that region fills up, it is split into two regions at whatever key is in the middle of the full region.
For your question 1, all keys would be added to one region initially. Assuming an even spread of keys, you should expect to see something close to a-m in one and n-z in another, after the first split occurs.
To show this graphically, assume our two regions can only store four rows each. After entering four records, you'd see:
REGION 1 REGION 2
+-----+ +-----+
| axx | | |
| bxx | | |
| cxx | | |
| dxx | | |
+-----+ +-----+
Now if we want to add axy, it won't fit in REGION 1 and so splitting occurs across the middle of the region:
REGION 1 REGION 2
+-----+ +-----+
| axx | | cxx |
| bxx | | dxx |
| | | |
| | | |
+-----+ +-----+
and finally our new record is added:
REGION 1 REGION 2
+-----+ +-----+
| axx | | cxx |
| axy | | dxx |
| bxx | | |
| | | |
+-----+ +-----+
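The walkthrough above can be sketched as a small simulation. This is only a toy model under the same assumption as the diagrams (a region holds at most four rows); real HBase decides when to split based on region size on disk and its configured split policy, not on a row count. The names ToyTable, CAPACITY, put and region_index are made up for this illustration and are not part of any HBase API.

```python
from bisect import bisect_right, insort

CAPACITY = 4  # toy limit taken from the diagrams; real regions split by size on disk


class ToyTable:
    """Toy model of a table whose regions split at their middle key when full."""

    def __init__(self):
        self.start_keys = [""]  # start key of each region; "" means "from the beginning"
        self.regions = [[]]     # sorted row keys held by each region

    def region_index(self, key):
        # A key belongs to the last region whose start key is <= key.
        return bisect_right(self.start_keys, key) - 1

    def put(self, key):
        i = self.region_index(key)
        if len(self.regions[i]) >= CAPACITY:
            # The target region is full: split it, then look the key up again.
            self._split(i)
            i = self.region_index(key)
        insort(self.regions[i], key)

    def _split(self, i):
        rows = self.regions[i]
        mid = len(rows) // 2
        # The middle key becomes the start key of the new right-hand region.
        self.start_keys.insert(i + 1, rows[mid])
        self.regions[i:i + 1] = [rows[:mid], rows[mid:]]


table = ToyTable()
for key in ["axx", "bxx", "cxx", "dxx"]:
    table.put(key)
print(table.regions)  # [['axx', 'bxx', 'cxx', 'dxx']]

table.put("axy")
print(table.regions)  # [['axx', 'axy', 'bxx'], ['cxx', 'dxx']]
```

The second print matches the last diagram: the full region is cut at cxx, and axy then lands in the left half because it sorts before cxx.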
PRE-SPLITTING
If you know your likely key distribution in advance and wish to avoid expensive automatic splits, you can pre-split when you create the table:
create 'animals', 'a', {SPLITS => ['e','m','r']}
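This would create four regions covering the key ranges up to e, e-m, m-r, and r onward. Each split point is the start key of the next region, so a range includes its start key and excludes its end key; the first region has no lower bound and the last has no upper bound.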
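To see which region a given row key would fall into with those split points, here is a minimal sketch. It assumes plain ASCII keys, so that Python's string ordering agrees with the byte-wise lexicographic ordering HBase uses for row keys. The function name region_for is hypothetical and only illustrates the lookup; it is not an HBase call.

```python
from bisect import bisect_right

SPLITS = ["e", "m", "r"]  # split points from the create command above


def region_for(row_key, splits=SPLITS):
    """Return the 0-based index of the region a row key would fall into."""
    # Each split point is the first key of the next region, so a key equal
    # to a split point belongs to the region on its right.
    return bisect_right(splits, row_key)


for key in ["axx", "e", "lion", "m", "zebra"]:
    print(key, "-> region", region_for(key))
```

With these split points, axx maps to region 0, e and lion to region 1, m to region 2, and zebra to region 3.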