Reputation: 157
I am having trouble understanding how major compaction differs from minor compaction. As far as I know, a minor compaction merges some HFiles into one or a few HFiles.
And I think major compaction does almost the same thing, except that it also handles deleted rows.
So I have no idea why major compaction brings back the data locality of HBase (when it is used over HDFS).
In other words, why can't minor compaction restore data locality, when to me both minor and major compaction are just merging HFiles into a smaller number of HFiles?
And why does only major compaction dramatically improve read performance? I would think minor compaction also contributes to read performance.
Please help me to understand.
Thank you in advance.
Upvotes: 4
Views: 6968
Reputation: 1403
Before understanding the difference between major and minor compactions, you need to understand a factor that impacts performance from the point of view of compactions: data locality. When HBase runs over HDFS, a region server gets good locality when the HFiles it reads have a replica on its local data node. As you can imagine, the chances of having poor locality are higher for older data, due to restarts and region rebalances.
Now, an easy way to understand the difference between minor and major compactions is as follows:
Minor Compaction: This compaction type runs all the time and focusses mainly on newly written files. By virtue of being new, these files are small and can contain delete markers for data in older files. Since this compaction only looks at relatively newer files, it does not touch or delete data from older files. This means that until a different compaction type comes along and deletes the older data, a minor compaction cannot remove the delete markers even from the newer files; otherwise, the older deleted KeyValues would become visible again.
This leads to two outcomes:
As the files being touched are relatively newer and smaller, their impact on data locality is very low. In fact, during a write operation, a region server tries to write the primary replica of the data on the local HDFS data node anyway. So, a minor compaction usually does not add much value to data locality.
Since the delete markers are not removed, some performance is still left on the table. That said, minor compactions are critical for HBase read performance, as they keep the total file count under control; an unchecked file count can become a big performance bottleneck, especially on spinning disks.
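To make the delete-marker argument concrete, here is a toy Python model of the merge semantics. This is not the real HBase code or API; the file contents, sequence ids, and the `read`/`minor_compact` helpers are all invented for illustration:

```python
# Toy model of HBase KeyValues as tuples: (row, sequence_id, op, value).
# Sketch only -- real HFiles, MemStores, and compaction policies are far richer.

def read(files):
    """Merge-read across files: for each row, the cell with the highest
    sequence id wins; a 'delete' marker hides the row entirely."""
    latest = {}
    for f in files:
        for row, seq, op, value in f:
            if row not in latest or seq > latest[row][0]:
                latest[row] = (seq, op, value)
    return {row: v for row, (seq, op, v) in latest.items() if op == "put"}

def minor_compact(newer_files):
    """Merge only the newer files. Delete markers MUST be kept, because
    the puts they shadow may live in older files we are not touching."""
    merged = [kv for f in newer_files for kv in f]
    return sorted(merged, key=lambda kv: (kv[0], kv[1]))

older = [("a", 1, "put", "old-a")]          # untouched by minor compaction
newer = [[("a", 2, "delete", None)], [("b", 3, "put", "new-b")]]

compacted = minor_compact(newer)
print(read([older, compacted]))             # row 'a' stays hidden

# If the minor compaction (wrongly) dropped the delete marker,
# the deleted row 'a' would become visible again:
wrong = [kv for kv in compacted if kv[2] != "delete"]
print(read([older, wrong]))
```

Running this shows the marker doing its job: with the marker kept, only `b` is visible; with the marker dropped, the deleted `a` resurfaces from the older file.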
Major Compaction: This compaction type runs rarely (once a week by default) and focusses on the complete cleanup of a store (one column family inside one region). The output of a major compaction is a single file per store. Since a major compaction rewrites all the data inside a store, it can remove both the delete markers and the older KeyValues marked as deleted by those markers.
This also leads to two outcomes:
Since delete markers and deleted data are physically removed, file sizes are reduced dramatically, especially in a system receiving a lot of delete operations. This can lead to a dramatic increase in performance in a delete-heavy environment.
Since all the data of a store is rewritten, this is also the chance to restore data locality for the older (and larger) files, where the drift might have happened due to restarts and rebalances as explained earlier. This leads to better IO performance during reads.
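Continuing the same toy model, a major compaction sees every file of the store at once, so it can safely drop both the delete markers and the cells they shadow. Again, this is an illustrative sketch, not the real HBase implementation:

```python
# Toy model of KeyValues as (row, sequence_id, op, value) tuples.
# A major compaction rewrites ALL files of a store into one output file,
# so it can physically remove tombstones and the cells they hide.

def major_compact(all_files):
    latest = {}
    for f in all_files:
        for row, seq, op, value in f:
            if row not in latest or seq > latest[row][1]:
                latest[row] = (row, seq, op, value)
    # Keep only live puts: delete markers and deleted cells are dropped,
    # which is safe because no other file of this store survives.
    return sorted(kv for kv in latest.values() if kv[2] == "put")

older = [("a", 1, "put", "old-a")]
newer = [("a", 2, "delete", None), ("b", 3, "put", "new-b")]

store = major_compact([older, newer])
print(store)   # single file: tombstone for 'a' and the shadowed put are gone
```

Because the output is a brand-new file written by the region server, its primary replica lands on the local data node, which is exactly how the major compaction restores locality for the old data it rewrites.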
More on HBase compactions: HBase Book
Upvotes: 8