Reputation: 427
This question pertains to formulating a query in PostgreSQL
Suppose I have the following table:
record entity docid sentencid
1 A 123 1231
1 A 123 1232
1 A 100 1001
1 C 100 1002
1 B 121 1212
1 B 123 1234
2 B 102 1021
2 D 111 1111
2 D 102 1022
2 E 101 1011
3 C 111 1115
3 C 111 1113
3 C 111 1114
Is there a PostgresSQL
query that I can use to select all rows for n (or less) random groups of entities for each record in this table? Lets say n is 2. So the query should select all rows for record 3 and all rows for any 2 random entity groups of record 1 and 2. The final result should be ordered by accession
, entity
, docid
, sentenceid
.
Here is an example result with n=2:
record entity docid sentencid
1 A 100 1001
1 A 123 1231
1 A 123 1232
1 B 121 1212
1 B 123 1234
2 D 102 1022
2 D 111 1111
2 E 101 1011
3 C 111 1113
3 C 111 1114
3 C 111 1115
assuming that the entities A and B were randomly selected from the set of entities (A,B,C) for record 1 and the entities D and E were randomly selected from the set of entities (B,D,E) for record 2. The n entities should be selected randomly for each record.
I have searched extensively for answers to this question but did not find any working query. Thank you for looking into this!
Upvotes: 1
Views: 316
Reputation: 32392
You can use row_number
with a random()
order to randomly select n
entities per record
group. Then join this to your main table
select * from Table1 t1
join (
select * from (
select record, entity,
row_number() over (partition by record order by random()) rn
from Table1
group by record, entity
) t where rn <= 2
) t2 on t1.record = t2.record and t1.entity = t2.entity
Upvotes: 1