hba

Reputation: 7780

How does HBase calculate the flush size?

I am trying to better understand the memstore flush algorithm in HBase.

I have a simple (snappy-compressed) table with 1 column family and I have configured HBase as follows (I have a couple of regions on this region server):

Based on the logs, flushes seem to happen at around the 70 MB mark. What I see repeatedly in the logs is something like this:

DefaultStoreFlusher Flushed memstore data size=68.14 MB at sequenceid=12561

Why not 128 MB?

Upvotes: 1

Views: 1066

Answers (2)

HPKG

Reputation: 345

Data size is the sum of the cell data alone (key bytes + value bytes). This is the actual data that will be flushed to the HFile. Heap usage for the same data is usually higher: along with the cells' data, it includes the metadata and index. A flush happens when the heap size reaches hbase.hregion.memstore.flush.size. The log may report the heap size as well.
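To make the gap concrete, here is a rough back-of-the-envelope sketch in Python using the numbers from the question. It assumes the flush fired exactly when the heap size hit 128 MB (the threshold the asker expected) and that the single column family accounts for all of it; both are assumptions, and the helper name is made up for illustration.

```python
# Figures taken from the question; the 128 MB threshold is assumed to be
# the configured hbase.hregion.memstore.flush.size.
FLUSH_SIZE_MB = 128.0   # assumed heap-size threshold that triggered the flush
DATA_SIZE_MB = 68.14    # "data size" reported by DefaultStoreFlusher

def implied_overhead(flush_size_mb, data_size_mb):
    """Return (overhead in MB, heap-to-data ratio).

    Hypothetical helper: assumes the flush fired at exactly
    flush_size_mb of memstore heap usage.
    """
    overhead_mb = flush_size_mb - data_size_mb
    ratio = flush_size_mb / data_size_mb
    return overhead_mb, ratio

overhead_mb, ratio = implied_overhead(FLUSH_SIZE_MB, DATA_SIZE_MB)
print(f"overhead: {overhead_mb:.2f} MB, heap/data ratio: {ratio:.2f}")
```

Under those assumptions, close to half of the memstore heap would be per-cell metadata and index structures rather than key and value bytes, which is consistent with a 68 MB data size at a 128 MB heap threshold.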

Upvotes: 2

badger

Reputation: 3246

Note that hbase.regionserver.global.memstore.upperLimit is now deprecated in favor of hbase.regionserver.global.memstore.size.

If the size of one memstore reaches hbase.hregion.memstore.flush.size, then all memstores in the region are flushed (even those smaller than 128 MB). There is also a region-server-level trigger, controlled by hbase.regionserver.global.memstore.size and hbase.regionserver.global.memstore.size.lower.limit: if the sum of all memstore sizes on a region server exceeds Heap * hbase.regionserver.global.memstore.size.lower.limit * hbase.regionserver.global.memstore.size, the region server forces flushes.
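As a quick illustration of that formula, here is a small Python sketch. The 8 GiB heap is a made-up example value, and the defaults of 0.4 and 0.95 are what I believe recent HBase versions ship with; check your own hbase-site.xml rather than relying on them. The function name is hypothetical.

```python
def global_flush_threshold_bytes(heap_bytes,
                                 global_memstore_size=0.4,
                                 lower_limit=0.95):
    """Heap * hbase.regionserver.global.memstore.size
            * hbase.regionserver.global.memstore.size.lower.limit

    The default fractions are assumptions, not values read from a cluster.
    """
    return int(heap_bytes * global_memstore_size * lower_limit)

heap = 8 * 1024**3  # hypothetical 8 GiB region server heap
threshold = global_flush_threshold_bytes(heap)
print(f"forced flushes start at about {threshold / 1024**2:.0f} MB of total memstore")
```

With only a couple of regions on the server, the total memstore size would normally stay far below this, so the per-region hbase.hregion.memstore.flush.size check is the one more likely to fire first.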

Upvotes: 0
