vanstrouble

Reputation: 33

How to make a NumPy array of 0s and 1s based on a probability array?

I know that using Python's random.choices I can do this:

import random


array_probabilities = [0.5 for _ in range(4)]
print(array_probabilities)  # [0.5, 0.5, 0.5, 0.5]

a = [random.choices([0, 1], weights=[1 - probability, probability])[0] for probability in array_probabilities]
print(a)  # [1, 1, 1, 0]

Using random.choices is fast, but I understand NumPy is even faster. How can I write the same code using NumPy? I'm just getting started with NumPy and would appreciate your feedback.

Upvotes: 1

Views: 3044

Answers (4)

Perico

Reputation: 201

Answering an old question... This could be what you're looking for:

import numpy as np

p1 = 0.5

np.random.choice([0, 1], p=[1 - p1, p1], size=4)

You can build the p array however you like, for example p = [0.5 for _ in range(2)]; it must have the same length as the list of values.
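If each element needs its own probability, as in the question, one possible approach is a Bernoulli draw per entry. The sketch below assumes NumPy's Generator API (np.random.default_rng); the seed, the sample probabilities, and the names rng and draws are illustrative and not part of the answer above.

```python
import numpy as np

# Illustrative inputs; the seed is an arbitrary choice for reproducibility.
array_probabilities = np.array([0.0, 0.25, 0.5, 1.0])
rng = np.random.default_rng(0)

# A binomial draw with n=1 is a Bernoulli trial:
# each entry is 1 with its own probability and 0 otherwise.
draws = rng.binomial(n=1, p=array_probabilities)
print(draws)
```

The first entry is always 0 and the last is always 1 here, because their probabilities are 0.0 and 1.0.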

Upvotes: 0

Kutay Kılıç

Reputation: 99

Your question got me wondering, so I wrote a basic function to compare their timings. It seems you are right! The timings differ, but only slightly. The code and its output are below.

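import random
import time

import numpy as np


def stack_question():
    # Time the pure-Python approach from the question, in milliseconds.
    start = time.perf_counter()
    array_probabilities = [0.5 for _ in range(4)]
    a = [random.choices([0, 1], weights=[1 - probability, probability])[0] for probability in array_probabilities]
    end = time.perf_counter()
    return (end - start) * 1000


def numpy_random_array():
    # Time drawing four uniform random numbers with NumPy, in milliseconds.
    start_time = time.perf_counter()
    val = np.random.rand(4, 1)
    end_time = time.perf_counter()
    return (end_time - start_time) * 1000


print("List implementation  ", stack_question())
print("Array implementation  ", numpy_random_array())

The output as originally posted (the values are on the scale of a Unix timestamp in milliseconds, so they do not measure elapsed time):

List implementation   1665476650232.8433
Array implementation   1665476650233.9226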

Edit: On GeeksforGeeks I found the following explanation of why NumPy arrays are faster.

NumPy Arrays are faster than Python Lists because of the following reasons:
An array is a collection of homogeneous data-types that are stored in contiguous memory locations. On the other hand, a list in Python is a collection of heterogeneous data types stored in non-contiguous memory locations. The NumPy package breaks down a task into multiple fragments and then processes all the fragments parallelly. The NumPy package integrates C, C++, and Fortran codes in Python. These programming languages have very little execution time compared to Python.
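To check the speed difference on your own machine, here is a rough sketch using the standard timeit module. The array size n, the repeat count, and the function names are assumptions made for illustration; the results depend on hardware and on the array size, so no timings are quoted.

```python
import random
import timeit

import numpy as np

n = 10_000  # hypothetical array size, large enough for a difference to show
array_probabilities = [0.5] * n
probs = np.array(array_probabilities)
rng = np.random.default_rng()


def with_random_choices():
    # One random.choices call per element, as in the question.
    return [random.choices([0, 1], weights=[1 - p, p])[0] for p in array_probabilities]


def with_numpy():
    # A single vectorized comparison against uniform draws.
    return (rng.random(n) < probs).astype(int)


# timeit.timeit returns the total time in seconds for `number` calls.
list_seconds = timeit.timeit(with_random_choices, number=5)
numpy_seconds = timeit.timeit(with_numpy, number=5)
print(f"random.choices: {list_seconds:.4f} s, NumPy: {numpy_seconds:.4f} s")
```

With only four elements, as in the question, fixed call overhead can dominate either approach, which is why the sketch uses a larger array.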

Upvotes: 2

LLaP

Reputation: 2727

import numpy as np
probabilities = np.random.rand(1,10)
# Map each random value to 1 if it exceeds 0.5, otherwise 0.
bools_arr = np.apply_along_axis(lambda x: 1 if x > 0.5 else 0, 1, [probabilities])

Upvotes: 0

mozway

Reputation: 260360

One option:

import numpy as np
out = (np.random.random(size=len(array_probabilities)) < array_probabilities).astype(int)

Example output:

array([0, 1, 0, 1])
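The direction of the comparison matters once the probabilities differ from 0.5. The following is a small empirical check, not part of the answer: the seed, the sample probabilities, and the sample count are arbitrary assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed for a repeatable check
array_probabilities = np.array([0.1, 0.5, 0.9])

# 100,000 independent rows of uniform draws, one column per probability.
uniform = rng.random(size=(100_000, array_probabilities.size))

# `<` marks a 1 when the draw falls below the probability, so P(1) = p.
freq_less = (uniform < array_probabilities).mean(axis=0)
# `>` marks a 1 with the complementary probability, 1 - p.
freq_greater = (uniform > array_probabilities).mean(axis=0)

print(freq_less)
print(freq_greater)
```

The first line printed should sit close to the probabilities themselves and the second close to their complements, so `<` is the comparison that treats each probability as the chance of getting a 1.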

Upvotes: 3
