marshall
marshall

Reputation: 2483

Sample uniformly from multisets

Given the set of integers {1,...,n}, I would like to sample uniformly from the binom{n+k-1}{k} distinct multi-subsets of size k. Is there an efficient way of doing this?

For example, the set {1,2,3} has 6 multi-subsets of size 2. These are {1,2}, {2,3}, {1,3}, {1,1}, {2,2}, {3,3}.

Upvotes: 5

Views: 527

Answers (2)

user2357112
user2357112

Reputation: 281519

Yes. Since you know there are (n+k-1) choose k such multi-subsets, you are probably aware of the stars and bars combinatorial problem whose solution provides that formula. The solution to that problem suggests a sampling procedure to produce multi-subsets: randomly choose a way to place k stars and n-1 bars, then determine how the bars partition the stars into groups:

import random
import collections

stars = set(random.sample(xrange(n+k-1), k))
multiset = collections.Counter()

# Don't hide the bin builtin.
bin_ = 1
for i in xrange(n+k-1):
    if i in stars:
        multiset[bin_] += 1
    else:
        bin_ += 1

This will produce a collections.Counter counting the number of times each number was chosen. I've initialized bin_ = 1 to produce a multi-subset of {1...n}; bin_ = 0 would produce a multi-subset of {0...n-1}.

(Previously, I posted an answer suggesting the use of a multinomial distribution. That is not the right distribution; it gives too little weight to results with repeated elements. Sorry for the error. Since the ways to place k stars and n-1 bars are in direct correspondence with the multi-subsets of {1...n}, this solution should produce a uniform distribution.)

Upvotes: 3

bubble
bubble

Reputation: 1672

Unfortunately I cannot test my solution now, it is based on the pervious one:

import random as rd
#n,k - should be predefined
x=[[j]*k for j in xrange(n)] # [1,1,1,1 - k times, 2,2,2,- k times etc.]
rd.sample(x,k)

Upvotes: 0

Related Questions