Reputation: 8602

Python: Sample N random items from list with weights but without repetition

I am creating a lottery system where individuals (identified by a unique ID) can hold multiple tickets. However, once someone is picked, they cannot be selected to win again.

Here is my example:

import random
entrants = ['John', 'Jane', 'Cthulhu']
allEntries = []
for entrant in entrants:
    numEntries = random.randint(1, 5)
    print("%s has %d entries" % (entrant, numEntries))
    allEntries.extend([entrant] * numEntries)
print(random.sample(allEntries, k=2))

My idea was to build a list that contains each entrant's name numEntries times and then sample from it. However, sometimes the same individual is picked for both winning slots. Is there a way to assign a weight to each entrant instead?

I tried using random.choices() with weights, but it samples with replacement, so it can also select the same individual as both winners.

import random
entrants = ['John', 'Jane', 'Cthulhu']
weights = []
for entrant in entrants:
    numEntries = random.randint(1, 5)
    print("%s has %d entries" % (entrant, numEntries))
    weights.append(numEntries)  # one weight per entrant, in the same order
print(random.choices(entrants, weights=weights, k=2))

Upvotes: 3

Answers (3)

pfalz-benni

Reputation: 23

The simplest way to do what you want is using NumPy. The following function does it all:

numpy.random.choice(a, size=None, replace=True, p=None)

a: array-like object (e.g. list) you want to select from

size: number of elements to select

replace: indicates whether the same item may be selected multiple times - in your case False

p: array-like object (e.g. list) with the probabilities for the elements in a (same order); the values must sum to 1, so ticket counts have to be divided by their total first

Reference: https://numpy.org/doc/stable/reference/random/generated/numpy.random.choice.html
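
As a minimal sketch of how that call could be applied to the lottery from the question: the ticket counts in num_entries are made-up values for illustration, and they are normalized because p expects probabilities rather than raw counts.

```python
import numpy as np

entrants = ['John', 'Jane', 'Cthulhu']
num_entries = [1, 1, 5]  # hypothetical ticket counts, same order as entrants

# p must sum to 1, so convert ticket counts into probabilities
total = sum(num_entries)
probabilities = [n / total for n in num_entries]

# replace=False ensures that no entrant is drawn twice
winners = np.random.choice(entrants, size=2, replace=False, p=probabilities)
print(winners.tolist())
```

Note that size must not exceed the number of entrants with a non-zero probability when replace=False.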

Upvotes: 2

Caleth

Reputation: 62719

The normal method to select randomly without repetition is to shuffle the entries and take the first N.

from random import shuffle

N = 1
entrants = ['John', 'Jane', 'Cthulhu']
shuffle(entrants)
print(entrants[:N])

Or more directly

from random import sample

N = 1
entrants = ['John', 'Jane', 'Cthulhu']
print(sample(entrants, N))

However, your requirement of weighted sampling means you'll need more than that: shuffle the list of tickets (one entry per ticket), then take the first N distinct names.

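from random import shuffle

def unique_sample(population, count):
    # population may contain repeats, e.g. allEntries from the question
    tickets = list(population)  # copy so the caller's list is not reordered
    shuffle(tickets)
    unique = set()
    for elem in tickets:
        if len(unique) >= count:
            return  # enough distinct winners found
        if elem not in unique:
            unique.add(elem)
            yield elem
    # if there are fewer distinct entrants than count, the generator simply ends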
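
The generator above is presumably meant to be fed the expanded ticket list from the question (each name repeated once per ticket). Under that assumption, the same idea can be sketched more compactly with random.sample for a shuffled copy and dict.fromkeys for order-preserving de-duplication. The names weighted_unique_sample and tickets are hypothetical, chosen only for this illustration.

```python
import random

def weighted_unique_sample(tickets, count):
    """Return up to `count` distinct names from a list with one entry per ticket."""
    # Sampling the full length yields a shuffled copy; the input is left untouched
    shuffled = random.sample(tickets, len(tickets))
    # dict.fromkeys keeps only the first occurrence of each name, in shuffled order
    distinct = list(dict.fromkeys(shuffled))
    return distinct[:count]

tickets = ['John', 'Jane'] + ['Cthulhu'] * 5  # made-up ticket counts
print(weighted_unique_sample(tickets, 2))
```

An entrant with more tickets is more likely to appear early in the shuffled list, which is what gives the weighting.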

Upvotes: 1

Bijan

Reputation: 8602

I liked my approach of repeating each entrant's name once per entry. The problem with my attempts was that a selected winner stayed in the pool, so now I remove all of the winner's entries after each draw.

import random

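def selectWinners(allEntries, numWinners):
    winners = []
    print("Selecting %d winners" % numWinners)
    print("Entries", allEntries)
    for i in range(numWinners):
        if not allEntries:  # fewer distinct entrants than requested winners
            break
        winner = random.choice(allEntries)
        print("%d: %s won" % (i+1, winner))
        winners.append(winner)
        # Remove every entry of the winner (this modifies the caller's list in place)
        allEntries[:] = [x for x in allEntries if x != winner]
    return winners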

entrants = ['John', 'Jane', 'Cthulhu']
allEntries = []
for entrant in entrants:
    numEntries = random.randint(1, 5)
    print("%s has %d entries" % (entrant, numEntries))
    allEntries.extend([entrant] * numEntries)
selectWinners(allEntries, 2)

This prints output like:

John has 1 entries
Jane has 1 entries
Cthulhu has 5 entries
Selecting 2 winners
Entries ['John', 'Jane', 'Cthulhu', 'Cthulhu', 'Cthulhu', 'Cthulhu', 'Cthulhu']
1: Cthulhu won
2: Jane won
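
If building the expanded list is undesirable (for example with very large ticket counts), the same draw-then-remove idea could be sketched with random.choices and per-entrant weights instead. This is only an illustration under the assumption that every ticket count is positive. The function name draw_winners and the dict layout are hypothetical.

```python
import random

def draw_winners(entries, num_winners):
    """entries maps each entrant's name to a positive ticket count."""
    pool = dict(entries)  # work on a copy so the caller's dict is unchanged
    winners = []
    # Cannot pick more winners than there are entrants
    for _ in range(min(num_winners, len(pool))):
        names = list(pool)
        weights = [pool[name] for name in names]
        # Draw a single weighted winner, then take them out of the pool
        winner = random.choices(names, weights=weights, k=1)[0]
        winners.append(winner)
        del pool[winner]
    return winners

entries = {'John': 1, 'Jane': 1, 'Cthulhu': 5}  # made-up ticket counts
print(draw_winners(entries, 2))
```

Each draw uses only the remaining entrants' ticket counts, which mirrors removing all of a winner's entries from the list.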

Upvotes: 0
