Reputation: 639
I have the following table
MyTable
ID
MessageType
MessageDate
MessageBody
The table is a few million rows but there are only 100 unique MessageType in it.
What I need is a sample of each MessageType (must include at least MessageType and MessageBody), but I can't do a DISTINCT
as that only gets me the MessageType column.
I am thinking something like
SELECT TOP 5 *
FROM MyTable
WHERE MessageType IN (SELECT DISTINCT MessageType FROM MyTable)
I know this doesn't work as it just me the top 5, but I am not sure how to make SQL loop through this.
Thanks for any help
Upvotes: 3
Views: 2104
Reputation: 453920
The Row_Number version
;WITH cte AS
(
SELECT ID,
MessageType,
MessageDate,
MessageBody,
ROW_NUMBER() OVER (PARTITION BY MessageType ORDER BY (SELECT 0)) AS RN
FROM MyTable
)
SELECT ID,
MessageType,
MessageDate,
MessageBody
FROM cte
WHERE RN <=5
The CROSS APPLY
version
WITH m1 AS
(
SELECT DISTINCT MessageType
FROM MyTable
)
SELECT m2.*
FROM m1
CROSS APPLY
(
SELECT TOP 5 *
FROM MyTable m2
WHERE m2.MessageType = m1.MessageType
) m2
Upvotes: 2
Reputation: 21
Martin, if I'm reading your answer correctly, I think what you will produce is 5 samples of each message. Marc_s just wants one sample from each message.
I think what you need is:
SELECT ID,
MessageType,
MessageDate
FROM (
SELECT ID,
MessageType,
MessageDate,
ROW_NUMBER() OVER (PARTITION BY MessageType, ORDER BY NEWID() ) AS RN
-- I am using NewID() because it will produce a nice random sampling,
-- but Mark's SELECT(0) will be faster.
FROM MyTable
) sampling
WHERE RN =1
Upvotes: 2