Reputation: 1036
Let's say I have an app that determine the winners in a prize drawing. All entries are entered into a table indicating their employeeID. Each employee can enter the drawing multiple times. I select from the table, order by newid to get a random sort. I assume the more entries (database records) an employee has the better chance he will end up in the top 5 of my query each time I run it. So far so good. However, because each employee has multiple records, there is a good chance he will come up multiple times in the top 5. I need the ability to return 5 unique records from the randomly sorted results.
How do I get 5 unique rows while still ensuring those with multiple drawing entries get a heavier weighting in the selection?
My base query:
SELECT TOP 5 employeeID
FROM events
TABLESAMPLE(1000 ROWS)
ORDER BY CHECKSUM(NEWID());
Kinda what I am trying to do:
SELECT TOP 5 *
FROM events
WHERE employeeID IN (SELECT employeeID
FROM events
TABLESAMPLE(1000 ROWS)
ORDER BY CHECKSUM(NEWID())
)
ORDER BY CHECKSUM(NEWID())
But of course I cannot do an order by in the subquery.
Upvotes: 2
Views: 188
Reputation: 1532
Here's a fairly simple way to get what you're after:
select top 5 EmployeeID
from
(
select EmployeeID, row_number() over (order by newid()) DrawOrder
from Events
) wins
group by EmployeeID
order by min(DrawOrder)
Upvotes: 1
Reputation: 93181
Any solution must take into account 2 things:
Here's my approach:
;WITH
tmp1 AS
(
SELECT EmployeeID,
ROW_NUMBER() OVER (ORDER BY NEWID()) AS SortOrder
FROM Events
),
tmp2 AS
(
SELECT EmployeeID,
MIN(SortOrder) AS WinOrder
FROM tmp1
GROUP BY EmployeeID
)
SELECT TOP 5 *
FROM tmp2
ORDER BY WinOrder
The SQL Fiddle gives employees 1 & 5 higher chances to win, but they will only win once each, regardless of how many times they enter.
Upvotes: 3