Reputation: 15664
I have a table called readings that has > 76 million rows in it that I'm running this query on:
declare @tunnel_id int = 13
SELECT TOP 1 local_time, recorded_time
FROM readings
WHERE tunnel_id = @tunnel_id
ORDER BY id DESC
The id column is a bigint, set as the primary key, and has a clustered index, and there is also an index on the tunnel_id field.
The works great and returns in less than a second for about 16 out of the 20 different tunnel_id's I'm trying. However, on the last 4 or so the query takes 40 seconds and uses hundreds of thousands of reads.
I tried modifying the query into this:
SELECT TOP (1) local_time, recorded_time
FROM readings
where id = (
SELECT TOP 1 id
FROM readings
WHERE tunnel_id = 13
ORDER BY id DESC
)
Which once again is only slow for a few tunnel_id's. What perplexes me more is that the inner select runs quickly for the slow id's and if I hardcode the maximum id instead of the subquery it also runs quickly.
What am I missing here that's making this query perform poorly?
Edit for comments:
Tunnel_id is not unique, each tunnel has multiple millions of rows. This is running on Sql Server 2012.
I included the actual execution plans from both the fast and slow runs and they are identical.
Fast:
Slow:
But as you can see, the first executes in less than a second while the second takes 51 seconds.
Upvotes: 0
Views: 1257
Reputation: 15664
Just found that you can hint to use the tunnel_id index:
declare @tunnel_id int = 13
SELECT TOP 1 local_time, recorded_time
FROM readings
WITH (INDEX(idx_tunnel_id))
WHERE tunnel_id = @tunnel_id
ORDER BY id DESC
which works as expected and returns in less than 1 second.
Upvotes: 0
Reputation: 7813
The interesting part here is that SQL isn't using the index in tunnel_id at all and is just scanning the table in whole, which is slow if it's big like 76 millions rows. I think the real cause it isn't using it is because the ordering by id, as it must perform a lookup and then an additional sorting. I doubt at first that parameter sniffing is the main problem here.
I would try to change the index instead, and make it covering. If possible include in the index the local time, recorded time and the id (not 100% sure if it's needed as it's the cluster key anyway).
CREATE NONCLUSTERED INDEX IX_tunnel_id ON dbo.readings (tunnel_id) INCLUDE (id, local_time, recorded_time)
Note that, while this can improve this particular query, it will make inserts and updates a little slower, and require additional storage space.
Upvotes: 1
Reputation: 2147
The plan basically scans the entire clustered index from start to end and looks for the first row with tunnel_id = @tunnel_id.
My educated guess is that the 'slow' tunnels don't have any rows in the beginning of the clustered index and so it has to scan more of it.
This non-clustered index should speed things up:
CREATE NONCLUSTERED INDEX [IX_FOO] ON [readings]
(
tunnel_id,
ID
)
INCLUDE
(
local_time,
recorded_time
)
This could replace the existing index on tunnel_id.
Upvotes: 1