Reputation: 853
The query below accesses the Votes table that contains over 30 million rows. The result set is then selected from using WHERE n = 1
. In the query plan, the SORT operation in the ROW_NUMBER() windowed function is 95% of the query's cost and it is taking over 6 minutes to complete execution.
I already have an index on same_voter, eid, country include vid, nid, sid, vote, time_stamp, new
to cover the where clause.
Is the most efficient way to correct this to add an index on vid, nid, sid, new DESC, time_stamp DESC
or is there an alternative to using the ROW_NUMBER() function for this to achieve the same results in a more efficient manner?
SELECT v.vid, v.nid, v.sid, v.vote, v.time_stamp, v.new, v.eid,
ROW_NUMBER() OVER (
PARTITION BY v.vid, v.nid, v.sid ORDER BY v.new DESC, v.time_stamp DESC) AS n
FROM dbo.Votes v
WHERE v.same_voter <> 1
AND v.eid <= @EId
AND v.eid > (@EId - 5)
AND v.country = @Country
Upvotes: 1
Views: 1287
Reputation: 47464
One possible alternative to using ROW_NUMBER()
:
SELECT
V.vid,
V.nid,
V.sid,
V.vote,
V.time_stamp,
V.new,
V.eid
FROM
dbo.Votes V
LEFT OUTER JOIN dbo.Votes V2 ON
V2.vid = V.vid AND
V2.nid = V.nid AND
V2.sid = V.sid AND
V2.same_voter <> 1 AND
V2.eid <= @EId AND
V2.eid > (@EId - 5) AND
V2.country = @Country AND
(V2.new > V.new OR (V2.new = V.new AND V2.time_stamp > V.time_stamp))
WHERE
V.same_voter <> 1 AND
V.eid <= @EId AND
V.eid > (@EId - 5) AND
V.country = @Country AND
V2.vid IS NULL
The query basically says to get all rows matching your criteria, then join to any other rows that match the same criteria, but which would be ranked higher for the partition based on the new
and time_stamp
columns. If none are found then this must be the row that you want (it's ranked highest) and if none are found that means that V2.vid
will be NULL
. I'm assuming that vid
otherwise can never be NULL
. If it's a NULL
able column in your table then you'll need to adjust that last line of the query.
Upvotes: 1