Select top records with unique ids

Question

I have a query like this:

SELECT TOP(1000) UserId, Key, Value 
FROM MyTable 
WHERE Processed = 'false' ORDER BY TimeStamp"

I am iterating over the results and importing the data to an external system. This is rather slow, so I would like to run those operations in parallel for all 1000 items in the result set. This is only possible if I have no duplicate UserIds in the result set.

I have tried doing the filtering on the client, but I end up with a set that has maybe 400 members from the first 1000. This results in a lot of data being sent over the wire multiple times over different iterations.

Is there a way to structure my query so that it will return the first 1000 rows, taking only one row per UserId? It may return slightly newer records, but I want there to be exactly 1000 unique userIds in the result set. Is there a syntax for this?

RichardTheKiwi · Accepted Answer

SELECT TOP(1000) UserId, [Key], Value 
FROM
 (SELECT *, RN = ROW_NUMBER() OVER (PARTITION BY UserId order by TimeStamp)
  FROM MyTable 
  WHERE Processed = 'false') X
WHERE RN=1
ORDER BY TimeStamp;

Select top records with unique ids

Answers (1)

Related Questions