IPSUS
IPSUS

Reputation: 489

Efficient using of row_number over multiple columns

I'm trying to implement efficient user ratings in SQL Server 2008 R2 where records are constantly changing and each write of user data leads to subsequent read of rating which is just a ROW_NUMBER over multiple columns.

CREATE TABLE [dbo].[Scores]
(
    [Id] int NOT NULL IDENTITY (1, 1),
    [UserId] int NOT NULL,
    [MaxLevel] int NOT NULL,
    [BestDiff] int NOT NULL,
    [BestDiffGames] int NOT NULL,
    [BestDiffLastDate] datetime NOT NULL,
    [MaxLevelLastWinDate] datetime,

    -- other statistics

CONSTRAINT [PK_Scores] PRIMARY KEY CLUSTERED ([Id] ASC),
CONSTRAINT [FK_Scores_REF_Users] FOREIGN KEY([UserId]) REFERENCES [dbo].[Users] ([Id])
)
GO
CREATE UNIQUE NONCLUSTERED INDEX IX_Scores_User ON dbo.Scores
(
    UserId
)
GO
CREATE NONCLUSTERED INDEX IX_Scores_Rating ON dbo.Scores
(
    MaxLevel desc, BestDiff desc, BestDiffGames desc, 
    BestDiffLastDate desc, MaxLevelLastWinDate desc
)
GO

Each write to Scores table leads to subsequent read like this:

with Ratings (Rating, UserId) as
(
    select (ROW_NUMBER() over 
        (order by MaxLevel desc, BestDiff desc, BestDiffGames desc, 
    BestDiffLastDate desc, MaxLevelLastWinDate desc)) as Rating, 
        UserId
    from Scores with (nolock)
) 
select @Rating = Rating
from Ratings 
where UserId = @UserId

Also there are queries for rating pages using the same ROW_NUMBER. Currently the table Scores contains about 30K rows and when I run the latter query, execution plan looks nice but it has execution duration about 100-200ms! It is unacceptable for several user rating updates per second during peak workloads.

I wonder if there are more efficient ways to organize user ratings?

UPDATE 1: Thanks to Gordon Linoff I made further experiments and the final optimized way to get user rating is to use the query above and the following modified index (non-unique!):

CREATE NONCLUSTERED INDEX IX_Scores_Rating ON dbo.Scores
(
    MaxLevel desc, BestDiff desc, BestDiffGames desc, 
    BestDiffLastDate desc, MaxLevelLastWinDate desc,
    UserId
)
GO

UPDATE 2: Thanks to Mikael Eriksson the following query with top 1 boosted the query speed by 2x even for middle rated users! Top rated users gained up to 8x faster query. These speed improvement numbers were achieved after optimization 1 (index change), so the current execution time dropped to 2-16ms since the initial 100-200ms that is 6-100 times faster than initially!

with Ratings (Rating, UserId) as
(
    select (ROW_NUMBER() over 
        (order by MaxLevel desc, BestDiff desc, BestDiffGames desc, 
    BestDiffLastDate desc, MaxLevelLastWinDate desc)) as Rating, 
        UserId
    from Scores with (nolock)
) 
select top 1 @Rating = Rating
from Ratings 
where UserId = @UserId

Upvotes: 1

Views: 752

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269853

100-200 ms doesn't seem that bad.

If you had just one column for the ratings, then you might be able to do:

select @Rating = 1 + count(*)
from scores s cross join
     (select * from scores s where userId = @UserId) su
where s.score > su.score;

This isn't exactly the same if you have ties; it is equivalent to a rank() rather than a row_number(), so it handles ties differently. If you could get the columns into a single column with an index, this should be fast.

You can do the same thing with multiple columns, but the logic gets complicated, and I'm not 100% sure the index will always get used properly. Something like:

where s.score > su.score or
      (s.score = su.score and s.bestdiff > su.bestdif) or
      (s.score = su.score and s.bestdiff =  su.bestdif and s.BestDiffGames > su.BestDiffGames) or
      (s.score = su.score and s.bestdiff =  su.bestdif and s.BestDiffGames = su.BestDiffGames and s.MaxLevelLastWinDate > su.MaxLevelLastWinDate)

Upvotes: 1

Related Questions