Reputation: 1001
Column families for the same row belong to the same RegionServer. So the question here is: will a RegionServer store different column families on different machines?
Upvotes: 8
Views: 2636
Reputation: 1035
If the data in a table is big enough, HBase will split the table into different regions. Because HBase is a column-oriented DB, different column families will be stored in different regions.
Upvotes: 0
Reputation: 51
Not necessarily, but at some point it will. This is part of the basic HBase architecture. If you imagine an HBase table as a spreadsheet, with its rows and columns, then a region spans multiple successive rows in one direction and all columns of one or more column families in the other. This way, the whole sheet is covered with region tiles.
Each region is stored on one or more (typically three) cluster nodes. (If you lost all nodes containing a specific region at once, you would lose all of that region's data. If you lose only one replica, HBase makes sure it is replicated to another node from the remaining copies.)
Now, when the data contained in a region grows too big, HBase automatically initiates a region split, resulting in two new regions, each containing one half of the data. Only through region splits (besides region replication) does data eventually get distributed over an HBase cluster.
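To make the splitting idea concrete, here is a small toy model in Python. It is not HBase code and does not use the HBase API. `ToyTable`, `max_rows_per_region`, and the row-count threshold are all made up for illustration (real HBase decides on store size, not row count). It only shows how a table that starts as one region ends up as several contiguous row-key ranges.

```python
from bisect import bisect_right


class ToyTable:
    """Toy model (hypothetical, not HBase): a table is a sorted list of
    regions, each covering a contiguous range of row keys."""

    def __init__(self, max_rows_per_region=4):
        # Assumption: split on row count; real HBase splits on data size.
        self.max_rows = max_rows_per_region
        # start_keys[i] is the first row key served by regions[i];
        # the empty string means "from the start of the key space".
        self.start_keys = [""]
        self.regions = [{}]  # each region maps row key -> row data

    def region_index(self, row_key):
        # The owning region is the last one whose start key <= row_key.
        return bisect_right(self.start_keys, row_key) - 1

    def put(self, row_key, data):
        i = self.region_index(row_key)
        self.regions[i][row_key] = data
        if len(self.regions[i]) > self.max_rows:
            self._split(i)

    def _split(self, i):
        # Split at the middle row key so each half gets about half the rows.
        region = self.regions[i]
        keys = sorted(region)
        mid = keys[len(keys) // 2]
        lower = {k: v for k, v in region.items() if k < mid}
        upper = {k: v for k, v in region.items() if k >= mid}
        self.regions[i:i + 1] = [lower, upper]
        self.start_keys.insert(i + 1, mid)


table = ToyTable(max_rows_per_region=4)
for n in range(10):
    table.put(f"row{n:02d}", {"cf:col": n})

print(table.start_keys)  # ['', 'row02', 'row04', 'row06']
```

After ten inserts the single initial region has been split three times, and every row key still maps to exactly one region. In a real cluster the resulting regions can then be assigned to different RegionServers, which is how the data spreads out.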
Storing the data for one row in different columns of the same column family ensures that the data is stored together in one place.
Upvotes: 5