Randomly select rows under conditions

Question

On my spreadsheet I have the following dataset

ID         pack
1          a
1          b
1          c
2          a
3          c
4          a
4          c

I would like to use a function in excel that selects randomly one of the rows where the ID is not unique (it's not important which one, but I need it to be the only one per ID), otherwise has to repeat the same row.

The result according to example should be as below:

ID         pack
1          b
2          a
3          c
4          c

I tried to add a third column called count (counts the number of times the ID is repeated in the db) and calculating a new field as

IF(**count**=1,1,RANDBETWEEN(0,1))

but in some cases an ID (with multiple packs) gets always 0, in other cases gets always 1.

ID         pack    count    check
1          a       3        1
1          b       3        0       
1          c       3        0
2          a       1        1
3          c       1        1
4          a       2        0
4          c       2        0

Of course, last step of this is a new column with

IF(**check**=1,**pack_name**,0)

Scott Craner · Accepted Answer

With your ID's supplied use this formula:

=INDEX(B:B,AGGREGATE(15,6,ROW($A$2:$A$8)/($A$2:$A$8=D2),RANDBETWEEN(1,COUNTIF(A:A,D2))))

Put in E2 then copy/drag down.

Then hit F9 and it will randomly choose a different output that matches the ID.

Randomly select rows under conditions

Answers (1)

Related Questions