MeC

Reputation: 463

How to probabilistically populate a list in python?

I want to use a basic for loop to populate a list of values in Python, but I would like the values to be calculated probabilistically, such that p% of the time the values are calculated with (toy) equation 1 and (100-p)% of the time with equation 2.

Here's what I've got so far:

    # generate list of random probabilities 
    p_list = np.random.uniform(low=0.0, high=1.0, size=(500,))
    my_list = []

    # loop through but where to put 'p'? append() should probably only appear once
    for p in p_list:
        calc1 = x*y # equation 1
        calc2 = (x-y) # equation 2
        my_list.append(calc1)
        my_list.append(calc2)

Upvotes: 4

Views: 280

Answers (4)

vRathee

Reputation: 11

If you are OK with using NumPy, it is worth trying the numpy.random.choice method:

https://docs.scipy.org/doc/numpy-1.14.1/reference/generated/numpy.random.choice.html
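As a rough sketch of how a choice-based approach might look: the snippet below draws a True/False flag per element with probabilities p and 1 - p, then selects between the two equations. The values of p and n, the random x and y arrays, and the use of the newer Generator API (np.random.default_rng) instead of the legacy np.random.choice linked above are all assumptions made for illustration.

```python
import numpy as np

# Assumed for illustration: p, n, x and y are not specified in the question.
p = 0.3   # probability of using equation 1
n = 500   # number of values to generate
rng = np.random.default_rng(seed=0)
x = rng.integers(1, 10, size=n)
y = rng.integers(1, 10, size=n)

# Draw True with probability p and False with probability 1 - p.
use_eq1 = rng.choice([True, False], size=n, p=[p, 1 - p])

# Use equation 1 where the flag is True, equation 2 elsewhere.
my_list = np.where(use_eq1, x * y, x - y).tolist()
```

Because the selection is vectorized, no explicit for loop is needed; the trade-off is that both equations are evaluated for every element before one result is kept.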

Upvotes: 0

Green Cloak Guy

Reputation: 24691

You've already generated a list of probabilities, p_list, with one entry for each value you want to generate in my_list. The Pythonic way to use them is a ternary (conditional) expression inside a list comprehension:

from random import random  # import the function itself; the bare module is not callable
my_list = [(x*y if random() < p else x-y) for p in p_list]

If we were to expand this into a proper for loop:

my_list = []
for p in p_list:
    if random() < p:
        my_list.append(x*y)
    else:
        my_list.append(x-y)

If we wanted to be even more pythonic, regarding calc1 and calc2, we could make them into lambdas:

calc1 = lambda x,y: x*y
calc2 = lambda x,y: x-y
...
my_list = [calc1(x,y) if random() < p else calc2(x,y) for p in p_list]

or, depending on how x and y vary for your function (assuming they're not static), you could even do the comprehension in two steps:

calc_list = [calc1 if random() < p else calc2 for p in p_list]
my_list = [calc(x,y) for calc in calc_list]
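If x and y do change from element to element, the two-step version could be paired with zip, roughly as sketched here. The xs and ys lists, the single fixed probability p, and the seeded Random instance are hypothetical and only there to make the example self-contained.

```python
from random import Random

def calc1(x, y):
    return x * y  # equation 1

def calc2(x, y):
    return x - y  # equation 2

# Hypothetical inputs: one (x, y) pair per value to generate.
xs = [1, 2, 3, 4]
ys = [5, 6, 7, 8]
p = 0.3           # assumed probability of choosing equation 1
rng = Random(42)  # seeded only so the run is repeatable

# Step 1: choose a function for each position.
calc_list = [calc1 if rng.random() < p else calc2 for _ in xs]
# Step 2: apply each chosen function to its own x and y.
my_list = [calc(x, y) for calc, x, y in zip(calc_list, xs, ys)]
```

Keeping the choice of function separate from its application makes it easy to inspect which equation was used at each position.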

Upvotes: 2

Grismar

Reputation: 31319

The other answers seem to assume you want to keep the calculated chances around. If all you are after is a list of results for which equation 1 was used p% of the time and equation 2 100-p% of the time, this is all you need:

from random import random, seed

inputs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# change the seed to see different 'random' outcomes
seed(1)
results = [x * x if random() > 0.5 else 2 * x for x in inputs]

print(results)
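The 0.5 above is a fixed 50/50 split. One way to expose p as a parameter is sketched below; the populate helper and its arguments are hypothetical names, not part of the original answer, and the comparison is written as random() < p so that p is the chance of the first equation.

```python
from random import Random

def populate(inputs, p, rng):
    """Apply x * x with probability p, otherwise 2 * x, to each input."""
    # rng.random() is uniform on [0, 1), so the test is True with probability p.
    return [x * x if rng.random() < p else 2 * x for x in inputs]

rng = Random(1)  # seeded only so the run is repeatable
inputs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
results = populate(inputs, 0.25, rng)
print(results)
```

With p = 1.0 every value comes from the first equation, and with p = 0.0 every value comes from the second, which is a quick way to sanity-check the direction of the comparison.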

Upvotes: 0

Greg Dubicki

Reputation: 6940

I took the approach of making minimal changes to the original code and using easy-to-understand syntax:

import numpy as np

p_list = np.random.uniform(low=0.0, high=1.0, size=(500,))

my_list = []

# uncomment the 2 lines below to define x and y so that this code runs
#x = 1
#y = 2

for p in p_list:
    # randoms are uniformly distributed over the half-open interval [low, high)
    # so check if p is in [0, 0.5) for equation 1 or [0.5, 1) for equation 2
    if p < 0.5:
        calc1 = x*y # equation 1
        my_list.append(calc1)
    else:
        calc2 = (x-y) # equation 2
        my_list.append(calc2)

Upvotes: 1
