Reputation: 4236
Say I have blog post comments. On insert they get the current utc date time as their creation time (via sysutcdatetime default value) and they get an ID (via integer identity column as PK).
Now I want to sort the comments descending by their age. Is it safe to just do a ORDER BY ID
or is it required to use the creation time? I'm thinking about "concurrent" commits and rollbacks of inserts and an isolation level of read committed. Is it possible that the IDs sometimes do not represent the insert order?
I'm asking this because if sorting by IDs is safe then I could have the following benefits:
This answer says it is possible when you don't have the creation time but is it always safe?
This answer says it is not safe with an identity column. But when it's also the PK the answer gives an example with sorting by ID without mentioning if this is safe.
Edit:
This answer suggests sorting by date and then by ID.
Upvotes: 3
Views: 1783
Reputation: 32695
I would order by ID
.
Technically you may get different results when sorting by ID vs sorting by time.
The sysutcdatetime
will return the time when transaction starts. ID
could be generated somewhere later during the transaction. Also, the clock on any computer always drifts. When computer clock is synchronized with the time source, the clock may jump forward or backwards. If you do the sync often, the jump will be small, but it will happen.
From the practical point of view, if two comments were posted within, say, one second of each other, does it really matter which of these comments is shown first?
What I think does matter is the consistency of the display results. If the system somehow decides that comment A should go before comment B, then this order should be preserved everywhere across the system.
So, even with the highest precision datetime2(7)
column it is possible to have two comments with exactly the same timestamp and if you order just by this timestamp it is possible that sometimes they will appear as A, B
and sometimes as B, A
.
If you order by ID
(primary key), you are guaranteed that it is unique, so the order will be always well defined.
I would order by ID
.
On a second thought, I would order by time and ID.
If you show the time of the comment to the user it is important to show comments according to this time. To guarantee consistency sort by both time and ID in case two comments have the same timestamp.
Upvotes: 2
Reputation: 171178
Yes, the IDs can be jumbled because ID generation is not part of the insert transaction. This is in order to not serialize all insert transactions on the table.
The most correct way to sort would be ORDER BY DateTime DESC, ID DESC
with the ID being added as a tie breaker in case the same date was generated multiple times. Tie breakers in sorts are important to achieve deterministic results. You don't want different data to be shown for multiple refreshes of the page for example.
You can define a covering index on DateTime DESC, ID DESC
and achieve the same performance as if you had ordered by the CI key (here: ID
). There's no relevant physical difference between CI and NCIs.
Since you mention the PK somewhere I want to point out that the choice of the PK does not affect any of this. Only indexes do. The query processor does not ever care about PKs and unique keys.
Upvotes: 4
Reputation: 844
if you sort on id based on descending order and you are filtering on basis of user then your blog will automatically show latest post on above and that will do the job for you. so dont use date as sorting
Upvotes: 0