Reputation: 970
I want to sample from a database table by these condition: my sample should have at least 100 sample in every hour of every day from specific category column . table is as below :
id recieved_time catgory
1 2022-08-01 4:05 blue
1 2022-08-01 5:05 red
1 2022-08-01 7:05 red
...
can anyone help me to write an efficient sql query ? (by the way i am using clickhouse if that help)
Upvotes: 0
Views: 130
Reputation: 525
Try ROW_NUMBER()
without an ORDER BY condition if you just want a random sampling of 100 per category:
SELECT
*
FROM
(
SELECT
*,
ROW_NUMBER() OVER (
PARTITION BY
category,
DATE_TRUNC('hour', received_time)
ORDER BY
(SELECT NULL) -- change if you care
) AS category_rownumber
FROM
TableName
) X
WHERE
category_rownumber <= 100
ORDER BY
category,
received_time DESC
;
Upvotes: 1