James

Reputation: 1001

Will HBase store column families for the same row on different machines?

Column families for the same row belong to the same RegionServer. So the question is: will a RegionServer store different column families on different machines?

Upvotes: 8

Views: 2636

Answers (2)

haosdent

Reputation: 1035

If the table holds enough data, HBase will split it into multiple regions. Because HBase is a column-oriented database, different column families will be stored in different regions.

Upvotes: 0

zillion1

Reputation: 51

Not necessarily, but at some point it will. This is part of the basic HBase architecture. If you imagine an HBase table as a spreadsheet, with its rows and columns, then a region spans multiple successive rows in one direction and all columns of one or more column families in the other. This way, the whole sheet is covered with region tiles.
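To make the tile picture concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not HBase's client API: the `Region` class, the `find_region` helper, the row keys, and the column family names `info` and `stats` are all hypothetical, and each tile is assumed to cover a half-open range of row keys.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Region:
    """Toy stand-in for a region tile: row range [start, end) plus families."""
    start: str                 # inclusive; "" means "from the first row"
    end: str                   # exclusive; "" means "up to the last row"
    families: Tuple[str, ...]  # column families covered by this tile

    def contains(self, row_key: str) -> bool:
        after_start = row_key >= self.start
        before_end = self.end == "" or row_key < self.end
        return after_start and before_end


def find_region(regions: List[Region], row_key: str, family: str) -> Optional[Region]:
    """Return the tile covering the given (row, column family) cell, if any."""
    for region in regions:
        if family in region.families and region.contains(row_key):
            return region
    return None


# Hypothetical table: two tiles that together cover every row.
REGIONS = [
    Region(start="", end="m", families=("info", "stats")),
    Region(start="m", end="", families=("info", "stats")),
]

if __name__ == "__main__":
    print(find_region(REGIONS, "alice", "info"))
    print(find_region(REGIONS, "zoe", "stats"))
```

In this sketch a cell is located by its row key and column family, so rows that sort next to each other fall into the same tile.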

Each region is stored on one or more (typically three) cluster nodes. (If you lost all nodes containing a specific region at once, you would lose all of that region's data. If you lost only one replica, HBase makes sure it is replicated to another node from the remaining copies.)

Now, when the data contained in a region grows too big, HBase automatically initiates a region split, resulting in two new regions, each containing one half of the data. Apart from region replication, region splits are the only way data eventually gets distributed over an HBase cluster.
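The split behaviour described above can be sketched in a few lines of Python. This is only an illustration: the `Table` class and the `MAX_ROWS_PER_REGION` threshold are hypothetical, and the sketch counts rows for simplicity, whereas real HBase decides when to split based on the size of the stored data.

```python
from bisect import insort
from typing import List

MAX_ROWS_PER_REGION = 4  # hypothetical threshold, chosen only for the demo


class Table:
    """Toy table whose regions are sorted lists of row keys."""

    def __init__(self) -> None:
        # Start with a single empty region that covers every row key.
        self.regions: List[List[str]] = [[]]

    def put(self, row_key: str) -> None:
        region = self._region_for(row_key)
        if row_key not in region:
            insort(region, row_key)  # keep row keys sorted inside the region
        if len(region) > MAX_ROWS_PER_REGION:
            self._split(region)

    def _region_for(self, row_key: str) -> List[str]:
        # Regions are kept in row-key order: pick the last region whose
        # first key is <= row_key, falling back to the first region.
        chosen = self.regions[0]
        for region in self.regions:
            if region and region[0] <= row_key:
                chosen = region
        return chosen

    def _split(self, region: List[str]) -> None:
        # Replace the oversized region with two halves, preserving order.
        mid = len(region) // 2
        index = next(i for i, r in enumerate(self.regions) if r is region)
        self.regions[index:index + 1] = [region[:mid], region[mid:]]


if __name__ == "__main__":
    table = Table()
    for key in ["a", "b", "c", "d", "e"]:
        table.put(key)
    print(table.regions)
```

After a split, each half holds a contiguous range of row keys, so a later write still lands in exactly one region.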

Storing data for one row in different columns of the same column family ensures that the data is stored together in one place.

Upvotes: 5
