Connie DeCinko
Connie DeCinko

Reputation: 1036

Unique Top 5 Random Query

Let's say I have an app that determine the winners in a prize drawing. All entries are entered into a table indicating their employeeID. Each employee can enter the drawing multiple times. I select from the table, order by newid to get a random sort. I assume the more entries (database records) an employee has the better chance he will end up in the top 5 of my query each time I run it. So far so good. However, because each employee has multiple records, there is a good chance he will come up multiple times in the top 5. I need the ability to return 5 unique records from the randomly sorted results.

How do I get 5 unique rows while still ensuring those with multiple drawing entries get a heavier weighting in the selection?

My base query:

SELECT TOP 5 employeeID
 FROM events 
   TABLESAMPLE(1000 ROWS) 
 ORDER BY CHECKSUM(NEWID());

Kinda what I am trying to do:

SELECT TOP 5 *
    FROM events 
    WHERE employeeID IN (SELECT employeeID
     FROM events 
       TABLESAMPLE(1000 ROWS) 
     ORDER BY CHECKSUM(NEWID())
     )
 ORDER BY CHECKSUM(NEWID())

But of course I cannot do an order by in the subquery.

Upvotes: 2

Views: 188

Answers (2)

Jace
Jace

Reputation: 1532

Here's a fairly simple way to get what you're after:

select top 5 EmployeeID
from
(
    select EmployeeID, row_number() over (order by newid()) DrawOrder
    from Events
) wins
group by EmployeeID
order by min(DrawOrder)

Upvotes: 1

Code Different
Code Different

Reputation: 93181

Any solution must take into account 2 things:

  • If an employee enter multiple tickets, his chance of winning increases relative to other.
  • Everyone can only win once

Here's my approach:

;WITH
    tmp1 AS
    (
        SELECT      EmployeeID,
                    ROW_NUMBER() OVER (ORDER BY NEWID()) AS SortOrder
        FROM        Events
    ),
    tmp2 AS
    (
        SELECT      EmployeeID,
                    MIN(SortOrder) AS WinOrder
        FROM        tmp1
        GROUP BY    EmployeeID
    )


SELECT TOP 5 *
FROM tmp2
ORDER BY WinOrder

The SQL Fiddle gives employees 1 & 5 higher chances to win, but they will only win once each, regardless of how many times they enter.

Upvotes: 3

Related Questions