user2531569

Reputation: 619

How are row-level deletes handled in HBase?

I am new to HBase. Could someone please clarify how row-level deletes work in HBase? Say we have 10 records in a table. As I understand it, every record is stored in a separate HFile, so deleting a record deletes the corresponding HFile. That is how I understood row-level deletes to be handled in HBase.

But during compaction, smaller HFiles are merged into a larger HFile.

So all the data ends up stored together in larger HFiles. How are row-level deletes handled once all the data is stored together?

Upvotes: 1

Views: 638

Answers (2)

Mallik

Reputation: 196

  1. An HFile is not created as soon as you insert data. The data is first stored in the memstore. Once the memstore is sufficiently large, it is flushed to an HFile. A new HFile is not created for every record or row. Also remember that records are kept sorted while they are in memory and are flushed to the HFile in that order. This is why records in HFiles are always sorted.
  2. HFiles are immutable [files in HDFS in general are expected to be immutable]. Deleting a record does not remove it right away. Instead, the record is marked for deletion with a delete marker (tombstone). When the system runs a major compaction, the records marked for deletion are actually removed and the new HFile does not contain them. A minor compaction merges files but generally does not purge delete markers. Until the major compaction runs, the record still exists on disk, but it is masked from the results of any query.
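
To make the two points above concrete, here is a minimal toy model in Python. It is only a sketch and not the HBase API: the class `ToyStore`, its methods, and the `flush_threshold` parameter are all hypothetical names invented for illustration, and real HBase works on timestamped cells per column family rather than whole rows. It does show the same idea: writes go to a memstore, flushes produce immutable sorted files, a delete is just another write (a marker), and only the compaction step physically drops the data.

```python
from dataclasses import dataclass, field

TOMBSTONE = object()  # sentinel standing in for a delete marker (hypothetical)


@dataclass
class ToyStore:
    """Toy model of one store: a memstore plus a list of immutable 'HFiles'."""
    flush_threshold: int = 3
    memstore: dict = field(default_factory=dict)
    hfiles: list = field(default_factory=list)  # oldest first

    def put(self, row, value):
        self.memstore[row] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def delete(self, row):
        # A delete is just another write: it records a marker
        # and never touches an existing file.
        self.put(row, TOMBSTONE)

    def flush(self):
        # Write the memstore out as one immutable file, sorted by row key.
        if self.memstore:
            entries = sorted(self.memstore.items(), key=lambda kv: kv[0])
            self.hfiles.append(tuple(entries))
            self.memstore = {}

    def get(self, row):
        # Newest data wins: check the memstore, then files newest to oldest.
        if row in self.memstore:
            value = self.memstore[row]
            return None if value is TOMBSTONE else value
        for hfile in reversed(self.hfiles):
            for key, value in hfile:
                if key == row:
                    return None if value is TOMBSTONE else value
        return None

    def major_compact(self):
        # Merge every file into one and drop markers plus the rows they mask.
        self.flush()
        merged = {}
        for hfile in self.hfiles:  # oldest to newest, so newer entries win
            merged.update(hfile)
        live = sorted(
            ((k, v) for k, v in merged.items() if v is not TOMBSTONE),
            key=lambda kv: kv[0],
        )
        self.hfiles = [tuple(live)] if live else []


store = ToyStore(flush_threshold=3)
for row, value in [("row3", "c"), ("row1", "a"), ("row2", "b")]:
    store.put(row, value)          # third put triggers a flush to one file

store.delete("row2")               # only writes a marker into the memstore
masked = store.get("row2")         # reads already hide the row
still_on_disk = any(k == "row2" for k, _ in store.hfiles[0])

store.major_compact()              # the row is physically dropped only now
rows_after = [k for k, _ in store.hfiles[0]]
```

In this sketch, three rows share a single file rather than one file per row, and after `delete("row2")` the original file is unchanged while `get("row2")` already returns nothing. Only `major_compact()` rewrites the data without the deleted row.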

Upvotes: 1

Zoltan

Reputation: 3105

Basically, the row just gets marked for deletion, and the actual deletion happens during the next major compaction. Please see the Deletion in HBase article for details.

Upvotes: 1
