Matías Fidemraizer
Matías Fidemraizer

Reputation: 64933

Entity with versioning: optimal way of getting a list of last version using SQL

The question is about SQL on SQL Server 2012.

Background:

There is an entity called Entity.

Entity can have N EntityVersion (1:n).

EntityVersion may be approved or not. If it is approved, an attribute ApprovedBy will be assigned in the whole version.

Requirement:

There is the requirement of listing the latest entity versions of all entities, but only the latest approved version of each entity.

Problem:

I need to find an optimal way of achieveing the whole query for a large-scale solution.

For that reason, I have thought two possible approaches:

  1. IsLast flag column on EntityVersion. Whenever a new version is added, the whole transaction will set true to the newly added version, and the previously last version will be assigned with false.

  2. A trigger that will handle every new entity version insertion and will add the new version to a special table which will have the latest versions only. Whenever a new version is added, the previous is dropped from the whole special table. That is, the SELECT to obtain the whole listing will be very cheap.

Question:

What could be the optimal and valid approach? Do you have any other approach in mind?

Thanks!

Note that "optimal" means that will be scalable and it will work with milions of records.



Update

Since I've found that some users in their answers are requesting how is the schema of Entityand EntityVersion, I'm going to describe them better.

The easiest solution would be having something like a SELECT DISTINCT that could both define the distinct column and select other columns in the same query. Sadly, as far as I know, this isn't possible in SQL Server (am I wrong?).

Upvotes: 3

Views: 213

Answers (1)

RAS
RAS

Reputation: 3385

I think that IsLast flag column is sufficient, but....

  1. First question you should answer is: what's the ratio of Entities to EntityVersions? This can influence your decision.
  2. Also, I'm not sure on your table structure, but I would Assume that EntityVersion table is a copy of Entity table that stores changed fields (Or just diffs maybe?). If that's a case, why not store the latest version in Entity itself. I think this is a more semantically correct approach.
  3. Usually, if you have correct indexes (Maybe try to incorporate columnstoreIndexes?) you don't have to perform any special work like triggers which can be expensive. Try to do your tests with regular Join with GroupBy (Or you don't even need a join if your EntityVersion table has all of necessary fields). Or probably this approach is going to be a little faster: https://stackoverflow.com/a/438990/1792936
  4. Again, you can only construct an optimal query based on your execution plan and speed tests.

  5. If everything else fails, maybe try to incorporate CQRS pattern or something similar

Upvotes: 1

Related Questions