Aillyn
Aillyn

Reputation: 23803

Implementing soft delete with minimal impact on performance and code

There are some similar questions on the topic, but they are not really helping me.

I want to implement a soft delete feature like on StackOverflow, where items are not really deleted, but just hidden. I am using a SQL database. Here are 3 options:

For both of the above

Another option is to create a separate table to hold deleted items:

Is there a better option?

Upvotes: 4

Views: 2896

Answers (5)

Elemental
Elemental

Reputation: 7515

I think your analysis of the options is good but you missed a few relevant points which I list below. Almost all implementations that I have seen use some sort of deleted or versioning field on the row as you suggest in your first two options.

Using one table with deleted flag: If your indexes all contain the deleted flag field first and your query's mostly contain a where isdeleted=false type structure then it DOES solve you performance problems and the indexes very efficiently exclude the deleted rows. Similar logic could be used for the deleted date option.

Using two Tables In general you need to make massive changes to reports because some reports may refer to deleted data (like old sales figures might refer to a deleted sales category). One can overcome this by creating a view which is a union of the two tables to read from and only write to the active records table.

Upvotes: 2

Serginho
Serginho

Reputation: 7490

Let's suppose we create a field called dead to mark deleted rows. We can create a index where field dead is false. In this way, we only search non-deleted rows using the hint use index.

Upvotes: 1

Sparky
Sparky

Reputation: 15105

If the key is numeric, I handle a "soft-delete" by negating the key. (Of course, won't work for identity keys). You don't need to change your code at all, and can easily restore the record by multiplying by -1.

Just another approach to give some thought to... If the key is alphanumeric, you can do something similar by prepending a unique "marker" characters. Since deleted records will all begin with this marker, then will end up off by themselves in the index.

Upvotes: 2

Rikon
Rikon

Reputation: 2706

I personally would base my answer off of how often you anticipate your users wanting to access that deleted data or "restore" that deleted data.

If it's often, then I would go with a "Date_Deleted" field and put a calculated "IsDeleted" in my poco in the code.

If it's never (or almost never) then a history table or deleted table is good for the benefits you explained.

I personally almost never use deleted tables (and opt for isDeleted or date_deleted) because of the potential risk to referencial integrity. You have A -> B and you remove the record from B database... You now have to manage referencial integrity because of your design choice.

Upvotes: 3

Oded
Oded

Reputation: 499152

In my opinion, the best way forward, when thinking about scaling and eventual table/database sizes is your third option - a separate table for deleted items. Such a table can eventually be moved to a different database to support scaling.

I believe you have listed the three most common options. As you have seen, each has advantages and disadvantages. Personally, I like taking the longer view on things.

Upvotes: 3

Related Questions