DShook
DShook

Reputation: 15664

Query slow for certain criteria on clustered index

I have a table called readings that has > 76 million rows in it that I'm running this query on:

declare @tunnel_id int = 13
SELECT TOP 1 local_time, recorded_time
FROM readings
WHERE tunnel_id = @tunnel_id 
ORDER BY id DESC

The id column is a bigint, set as the primary key, and has a clustered index, and there is also an index on the tunnel_id field.

The works great and returns in less than a second for about 16 out of the 20 different tunnel_id's I'm trying. However, on the last 4 or so the query takes 40 seconds and uses hundreds of thousands of reads.

I tried modifying the query into this:

SELECT TOP (1) local_time, recorded_time
FROM readings
where id = (
    SELECT TOP 1 id
    FROM readings
    WHERE tunnel_id = 13
    ORDER BY id DESC
)

Which once again is only slow for a few tunnel_id's. What perplexes me more is that the inner select runs quickly for the slow id's and if I hardcode the maximum id instead of the subquery it also runs quickly.

What am I missing here that's making this query perform poorly?

Edit for comments:

Tunnel_id is not unique, each tunnel has multiple millions of rows. This is running on Sql Server 2012.

I included the actual execution plans from both the fast and slow runs and they are identical.

Fast: sql fast execution plan

Slow:

sql fast execution plan

But as you can see, the first executes in less than a second while the second takes 51 seconds.

Upvotes: 0

Views: 1257

Answers (3)

DShook
DShook

Reputation: 15664

Just found that you can hint to use the tunnel_id index:

declare @tunnel_id int = 13
SELECT TOP 1 local_time, recorded_time
FROM readings
WITH (INDEX(idx_tunnel_id))
WHERE tunnel_id = @tunnel_id 
ORDER BY id DESC

which works as expected and returns in less than 1 second.

Upvotes: 0

Alejandro
Alejandro

Reputation: 7813

The interesting part here is that SQL isn't using the index in tunnel_id at all and is just scanning the table in whole, which is slow if it's big like 76 millions rows. I think the real cause it isn't using it is because the ordering by id, as it must perform a lookup and then an additional sorting. I doubt at first that parameter sniffing is the main problem here.

I would try to change the index instead, and make it covering. If possible include in the index the local time, recorded time and the id (not 100% sure if it's needed as it's the cluster key anyway).

CREATE NONCLUSTERED INDEX IX_tunnel_id ON dbo.readings (tunnel_id) INCLUDE (id, local_time, recorded_time)

Note that, while this can improve this particular query, it will make inserts and updates a little slower, and require additional storage space.

Upvotes: 1

Vlad G.
Vlad G.

Reputation: 2147

The plan basically scans the entire clustered index from start to end and looks for the first row with tunnel_id = @tunnel_id.

My educated guess is that the 'slow' tunnels don't have any rows in the beginning of the clustered index and so it has to scan more of it.

This non-clustered index should speed things up:

CREATE NONCLUSTERED INDEX [IX_FOO] ON [readings]
(
    tunnel_id, 
    ID 
)
INCLUDE 
(
    local_time, 
    recorded_time
)

This could replace the existing index on tunnel_id.

Upvotes: 1

Related Questions