Reputation: 47
I'd like to confirm that
a = [random.choices([0,1],weights=[0.2,0.8],k=1) for i in range(0,10)]
does probabilistically the same thing as
a = random.choices([0,1],weights=[0.2,0.8],k=10)
In particular, I expect both to make 10 independent draws from the set {0,1} with probability 0.2 on 0 and 0.8 on 1. Is this right?
Thanks!
Upvotes: 3
Views: 8718
Reputation: 61930
As others have mentioned, the documentation is clear on this point. You can verify it further by setting the seed before each call, for example:
import random
random.seed(42)
print([random.choices([0, 1], weights=[0.2, 0.8], k=1)[0] for i in range(0, 10)])
random.seed(42)
print(random.choices([0, 1], weights=[0.2, 0.8], k=10))
Output
[1, 0, 1, 1, 1, 1, 1, 0, 1, 0]
[1, 0, 1, 1, 1, 1, 1, 0, 1, 0]
Furthermore, setting the seed just once leads to different results, as one might expect, because the second call continues from the generator state left by the first:
random.seed(42)
print([random.choices([0, 1], weights=[0.2, 0.8], k=1)[0] for i in range(0, 10)])
print(random.choices([0, 1], weights=[0.2, 0.8], k=10))
Output
[1, 0, 1, 1, 1, 1, 1, 0, 1, 0]
[1, 1, 0, 0, 1, 1, 1, 1, 1, 0]
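To check that the match is not a coincidence of seed 42, the same comparison can be repeated over many seeds. This is only a sketch: the helper names `draws_one_at_a_time` and `draws_in_one_call` are made up for illustration, and an exact match relies on `choices` consuming one random number per draw, which is how I understand CPython to behave but is not something the documentation promises.

```python
import random

POPULATION = [0, 1]
WEIGHTS = [0.2, 0.8]

def draws_one_at_a_time(seed, n=10):
    # Hypothetical helper: n separate calls with k=1.
    rng = random.Random(seed)
    # choices() always returns a list, so [0] unwraps the single draw.
    return [rng.choices(POPULATION, weights=WEIGHTS, k=1)[0] for _ in range(n)]

def draws_in_one_call(seed, n=10):
    # Hypothetical helper: a single call with k=n.
    rng = random.Random(seed)
    return rng.choices(POPULATION, weights=WEIGHTS, k=n)

# Collect every seed for which the two approaches disagree.
mismatches = [s for s in range(100)
              if draws_one_at_a_time(s) != draws_in_one_call(s)]
print(mismatches)
```

If the list of mismatches is empty, the two forms produced identical sequences for every seed tried. Using a private `random.Random(seed)` instance instead of `random.seed` keeps the check from disturbing the global generator.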
Upvotes: 2
Reputation: 3534
The documentation seems to indicate the two are probabilistically the same. To check empirically, I ran the following experiment:
from collections import defaultdict
import pprint
import random
results1 = defaultdict(int)
results2 = defaultdict(int)
for _ in range(10000):
    a = [random.choices([0,1],weights=[0.2,0.8],k=1) for i in range(0,10)]
    for sublist in a:
        for n in sublist:
            results1[n] += 1
for _ in range(10000):
    a = random.choices([0,1],weights=[0.2,0.8],k=10)
    for n in a:
        results2[n] += 1
print('first way 0s: {}'.format(results1[0]))
print('second way 0s: {}'.format(results2[0]))
print('first way 1s: {}'.format(results1[1]))
print('second way 1s: {}'.format(results2[1]))
I am seeing very similar results between the two methods.
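The raw counts are easier to judge as proportions compared against the expected 0.2. Below is a rough sketch of that; `zero_fraction` is a name I made up, and the seed of 0 is arbitrary and only there to make the run repeatable.

```python
import random

def zero_fraction(draws):
    # Share of draws that came out as 0; should be near 0.2.
    return draws.count(0) / len(draws)

rng = random.Random(0)  # arbitrary seed, assumed only for repeatability
n_rounds, k = 10_000, 10

# First way: one draw per call, unwrapped with [0].
one_at_a_time = [rng.choices([0, 1], weights=[0.2, 0.8], k=1)[0]
                 for _ in range(n_rounds * k)]

# Second way: k draws per call, flattened into one list.
batched = [x
           for _ in range(n_rounds)
           for x in rng.choices([0, 1], weights=[0.2, 0.8], k=k)]

print('first way fraction of 0s: {:.4f}'.format(zero_fraction(one_at_a_time)))
print('second way fraction of 0s: {:.4f}'.format(zero_fraction(batched)))
```

With 100,000 draws per method, the standard deviation of each fraction is roughly 0.0013, so both values should land well within 0.01 of 0.2.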
Upvotes: 3