Reputation: 33
I know that using Python's random.choices I can do this:
import random
array_probabilities = [0.5 for _ in range(4)]
print(array_probabilities) # [0.5, 0.5, 0.5, 0.5]
a = [random.choices([0, 1], weights=[1 - probability, probability])[0] for probability in array_probabilities]
print(a) # [1, 1, 1, 0]
How to make a NumPy array of 0s and 1s based on a probability array?
Using random.choices is fast, but I know numpy is even faster. I would like to know how to write the same code but using numpy. I'm just getting started with numpy and would appreciate your feedback.
Upvotes: 1
Views: 3044
Reputation: 201
Answering an old question... Could this be what you're looking for?
import numpy as np
p1 = 0.5  # probability of drawing a 1
np.random.choice([0, 1], p=[1 - p1, p1], size=4)
You can build the p array however you want, for example p = [0.5 for _ in range(2)], but it must have the same length as the list of values and its entries must sum to 1.
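If each element needs its own probability, as in the question, one possible sketch is to compare uniform draws against the probability array. The names `p`, `rng`, and `draws` and the example probabilities are illustrative assumptions, not part of the original post, and the seed is only there to make runs repeatable.

```python
import numpy as np

# Hypothetical per-element probabilities of drawing a 1.
p = np.array([0.0, 0.25, 0.75, 1.0])

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability

# rng.random returns uniform samples in [0, 1); an element becomes 1 when its
# sample falls below its probability, which happens with probability p[i].
draws = (rng.random(p.shape) < p).astype(int)
print(draws)
```

With this comparison, an entry with probability 0 is always 0 and an entry with probability 1 is always 1.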
Upvotes: 0
Reputation: 99
Your question got me wondering so I wrote a basic function to compare their timings. And it seems you are right! Timings change but only a little. Here you can see the code below and the output.
import numpy as np
import time
import random
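def stack_question():
    # Time the list-comprehension approach from the question, in milliseconds.
    start = time.perf_counter()
    array_probabilities = [0.5 for _ in range(4)]
    a = [random.choices([0, 1], weights=[1 - probability, probability])[0] for probability in array_probabilities]
    end = time.perf_counter()
    return (end - start) * 1000

def numpy_random_array():
    # Time drawing four uniform samples with NumPy, in milliseconds.
    start_time = time.perf_counter()
    val = np.random.rand(4, 1)
    end_time = time.perf_counter()
    return (end_time - start_time) * 1000

print("List implementation ", stack_question())
print("Array implementation ", numpy_random_array())
The output (recorded with my first version, which subtracted a value in seconds from one in milliseconds, so these figures are dominated by the timestamp and are not elapsed times):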
List implementation 1665476650232.8433
Array implementation 1665476650233.9226
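A single call on four elements finishes in microseconds, which is hard to time reliably. A rough sketch with `timeit` on a larger array may give a clearer comparison. The array length `n`, the repeat count, and the helper names `with_random_choices` and `with_numpy` are assumptions made for illustration, and the results will vary by machine.

```python
import random
import timeit

import numpy as np

n = 10_000  # hypothetical array length, chosen only for illustration
repeats = 5
probs = [0.5] * n
probs_arr = np.array(probs)
rng = np.random.default_rng()

def with_random_choices():
    # One random.choices call per probability, as in the question.
    return [random.choices([0, 1], weights=[1 - q, q])[0] for q in probs]

def with_numpy():
    # One vectorized comparison for the whole array.
    return (rng.random(n) < probs_arr).astype(int)

# timeit.timeit returns the total time in seconds for `number` calls.
t_list = timeit.timeit(with_random_choices, number=repeats)
t_numpy = timeit.timeit(with_numpy, number=repeats)
print(f"random.choices: {t_list / repeats * 1000:.3f} ms per call")
print(f"NumPy:          {t_numpy / repeats * 1000:.3f} ms per call")
```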
Edit: On GeeksforGeeks I found the following explanation of why NumPy arrays are faster.
NumPy Arrays are faster than Python Lists because of the following reasons:
An array is a collection of homogeneous data-types that are stored in contiguous memory locations. On the other hand, a list in Python is a collection of heterogeneous data types stored in non-contiguous memory locations.
The NumPy package breaks down a task into multiple fragments and then processes all the fragments parallelly.
The NumPy package integrates C, C++, and Fortran codes in Python. These programming languages have very little execution time compared to Python.
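The point about homogeneous, contiguous storage can be checked directly. This is only a small sketch, and the variable names `values`, `arr`, and `mixed` are made up for illustration.

```python
import numpy as np

values = [0, 1, 1, 0]
arr = np.array(values)

print(arr.dtype)                  # one dtype shared by every element
print(arr.flags["C_CONTIGUOUS"])  # whether the data sits in one contiguous block
print(arr.itemsize, arr.nbytes)   # bytes per element and bytes in total

mixed = [0, "one", 2.0]           # a list can hold unrelated types
print([type(x).__name__ for x in mixed])
```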
Upvotes: 2
Reputation: 2727
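import numpy as np
probabilities = np.random.rand(1, 10)  # uniform samples in [0, 1), shape (1, 10)
# Map every sample to 1 if it exceeds 0.5, otherwise 0.
bools_arr = np.apply_along_axis(lambda x: 1 if x > 0.5 else 0, 1, [probabilities])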
Upvotes: 0
Reputation: 260360
One option:
out = (np.random.random(size=len(array_probabilities)) < array_probabilities).astype(int)
Example output:
array([0, 1, 0, 1])
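The same kind of draw can also be written as a Bernoulli trial using NumPy's `Generator` interface. This is a sketch of an alternative, not the method above, and the name `rng` is an assumption introduced here.

```python
import numpy as np

array_probabilities = [0.5 for _ in range(4)]  # same input as in the question

rng = np.random.default_rng()

# A binomial draw with n=1 is a Bernoulli trial; passing an array-like p
# gives one independent 0/1 draw per probability.
out = rng.binomial(1, array_probabilities)
print(out)
```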
Upvotes: 3