rudivonstaden
rudivonstaden

Reputation: 8025

Select 1 random record per category using sql

I have a single large table conveniently called mytable, with three columns: id, cat_group, cat_other. I would like to select a random element from each cat_group using SQL. How can I do it?

Example data

| id | cat_group | cat_other |
+----+-----------+-----------+
|  1 | red       | fred      |
|  2 | blue      | sarah     |
|  3 | blue      | peter     |
|  4 | green     | spot      |
|  5 | red       | peter     |

Acceptable result

| cat_group | id | cat_other |
+-----------+----+-----------+
| red       |  5 | peter     |
| blue      |  2 | sarah     |
| green     |  4 | spot      |

Also good

| cat_group | id | cat_other |
+-----------+----+-----------+
| red       |  1 | fred      |
| blue      |  3 | peter     |
| green     |  4 | spot      |

I have seen some similar questions on SO, but I get lost in the implementation specifics. I'm using postgresql, but the answer should be able to be generalised across different databases.

Sqlfiddle at http://sqlfiddle.com/#!1/b42c8/2

Upvotes: 1

Views: 438

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270431

Here is one way, using row_number():

select cat_group, id, cat_other
from (select mytable.*,
             row_number() over (partition by cat_group order by random()) as seqnum
      from mytable
     ) as t
where seqnum = 1

The same idea works in other databases. However, the random number function may differ. For instance, in as SQL Server, rand() returns only one value per query -- as opposed to per row. There you would use newid().

Upvotes: 4

Related Questions