Reputation: 5785
I have read in the HBase Book that it is now possible to delete rows from a table.
How exactly does it work? Is the data moved somewhere for later deletion?
HBase is constrained by HDFS's inability to edit files once they are written, so I'm curious how this works. If anyone knows more about it, please share your knowledge.
Thanks.
Upvotes: 3
Views: 3779
Reputation: 6424
I found some useful info at http://hbase.apache.org/book.html#d705e2948
An extract from that section:
Deletes work by creating tombstone markers. For example, let's suppose we want to delete a row. For this you can specify a version, or else by default the currentTimeMillis is used. What this means is “delete all cells where the version is less than or equal to this version”. HBase never modifies data in place, so for example a delete will not immediately delete (or mark as deleted) the entries in the storage file that correspond to the delete condition. Rather, a so-called tombstone is written, which will mask the deleted values. If the version you specified when deleting a row is larger than the version of any value in the row, then you can consider the complete row to be deleted.
The row is 'flagged' as deleted and is not included in the retrieved data, but the data is still there on disk. When a major compaction occurs, the deleted data is removed along with its tombstone.
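To make the idea concrete, here is a toy model in Python. It is not HBase code and does not use the HBase API; `ToyStore`, `delete_row`, and `major_compact` are names I made up for illustration. It only sketches the behaviour described in the extract: writes are append-only, a delete appends a tombstone that masks cells with a version less than or equal to it, and the masked cells physically disappear only at compaction.

```python
class ToyStore:
    """Toy append-only store illustrating tombstone semantics (not HBase itself)."""

    def __init__(self):
        # Cells are only ever appended, never edited in place.
        self.cells = []       # (row, column, version, value)
        self.tombstones = []  # (row, version): masks cells with version <= this

    def put(self, row, column, version, value):
        self.cells.append((row, column, version, value))

    def delete_row(self, row, version):
        # A delete writes a marker; existing cells are left untouched.
        self.tombstones.append((row, version))

    def _masked(self, row, version):
        return any(r == row and version <= v for r, v in self.tombstones)

    def get(self, row):
        # Return the newest unmasked value per column.
        latest = {}
        for r, col, ver, val in self.cells:
            if r != row or self._masked(r, ver):
                continue
            if col not in latest or ver > latest[col][0]:
                latest[col] = (ver, val)
        return {col: val for col, (ver, val) in latest.items()}

    def major_compact(self):
        # Rewrite the store without masked cells, then drop the tombstones.
        self.cells = [c for c in self.cells if not self._masked(c[0], c[2])]
        self.tombstones = []


store = ToyStore()
store.put("row1", "cf:a", 1, "x")
store.put("row1", "cf:a", 2, "y")
store.delete_row("row1", 2)        # masks versions 1 and 2
hidden = store.get("row1")         # empty, although both cells still exist
store.put("row1", "cf:a", 3, "z")  # newer than the tombstone, so visible
store.major_compact()              # only now are the two old cells dropped
```

Between the delete and the compaction, `store.cells` still holds the old values; reads simply skip them. That is how a delete can work on top of files that are never modified after being written.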
Upvotes: 8