leon

Reputation: 10395

Compression in HBase

I am using HBase to store a lot of sensor data.

I have tried storing my sensor data in a txt file; a 20MB file shrinks to about 1MB on disk when I compress it.

My question is: does HBase itself compress the data automatically when writing it to disk?

Thanks

Upvotes: 4

Views: 5303

Answers (2)

Adrien M.

Reputation: 311

You can also alter your table to add compression support later. Your data will then actually be compressed at the next compaction (as ali said, because a new HFile will be written to disk). As far as I understand, the compression algorithm is applied at the block level, not to the whole HFile. That means that when reading data, HBase does not have to uncompress a several-GB HFile, only a data block of a few KBs.
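For example, enabling compression on an existing table and then forcing the HFiles to be rewritten might look like this in the HBase shell (the table name `sensor_data` and column family `cf` are made-up placeholders):

```shell
# In the HBase shell: add GZ compression to column family 'cf'
# of a hypothetical table 'sensor_data'
alter 'sensor_data', {NAME => 'cf', COMPRESSION => 'GZ'}

# Force a major compaction so existing HFiles are rewritten compressed
major_compact 'sensor_data'
```

Until the compaction runs, the old HFiles remain uncompressed on disk; only newly written files pick up the setting.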

Upvotes: 1

ali haider

Reputation: 20242

you can use LZO, GZIP, or Snappy for HBase compression. You will need to install LZO/Snappy yourself if you wish to use them (GZIP support is included).

normally, LZO is faster than GZIP, though GZIP's compression ratio is usually better. Snappy compresses and decompresses very quickly, but its compression ratios are normally worse.

When creating a table, you can specify the compression library per column family; HFiles are then compressed when written to disk (and decompressed when read).
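A minimal sketch of this in the HBase shell, using made-up names (`sensor_data` for the table, `cf` for the column family):

```shell
# In the HBase shell: create a table whose HFiles for column
# family 'cf' will be written GZ-compressed
create 'sensor_data', {NAME => 'cf', COMPRESSION => 'GZ'}

# Confirm the column-family setting
describe 'sensor_data'
```

You could substitute 'SNAPPY' or 'LZO' for 'GZ' once the corresponding native libraries are installed on the region servers.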

hope it helps

Upvotes: 2
