alec_djinn

Reputation: 10789

Fastest way of generating numpy arrays of randomly distributed 0s and 1s

I need to generate masks for dropout for a specific neural network. I am looking for the fastest possible way to achieve this using numpy (CPU only).

I have tried:

import numpy as np


def gen_mask_1(size, p=0.75):
    return np.random.binomial(1, p, size)


def gen_mask_2(size, p=0.75):
    mask = np.random.rand(size)
    mask[mask > p] = 0
    mask[mask != 0] = 1
    return mask

where p is the probability of drawing a 1.

The speed of these two approaches is comparable.

%timeit gen_mask_1(size=2048)
%timeit gen_mask_2(size=2048)

45.9 µs ± 575 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
47.4 µs ± 372 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Are there faster methods?

UPDATE

Following the suggestions received so far, I have tested a few extra implementations. I couldn't get @njit to work when setting parallel=True (TypingError: Failed in nopython mode pipeline (step: convert to parfors)); it works without it, but, I think, less efficiently. I have found a Python binding for Intel's mkl_random (thank you @MatthieuBrucher for the tip!) here: https://github.com/IntelPython/mkl_random. So far, using mkl_random together with @nxpnsv's approach gives the best result.

from numba import njit
import mkl_random


@njit
def gen_mask_3(size, p=0.75):
    mask = np.random.rand(size)
    mask[mask > p] = 0
    mask[mask != 0] = 1
    return mask

def gen_mask_4(size, p=0.75):
    return (np.random.rand(size) < p).astype(int)

def gen_mask_5(size):
    return np.random.choice([0, 1, 1, 1], size=size)

def gen_mask_6(size, p=0.75):
    return (mkl_random.rand(size) < p).astype(int)

def gen_mask_7(size):
    return mkl_random.choice([0, 1, 1, 1], size=size)

%timeit gen_mask_4(size=2048)
%timeit gen_mask_5(size=2048)
%timeit gen_mask_6(size=2048)
%timeit gen_mask_7(size=2048)

22.2 µs ± 145 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
25.8 µs ± 336 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
7.64 µs ± 133 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
29.6 µs ± 1.18 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Upvotes: 1

Views: 335

Answers (3)

nxpnsv

Reputation: 173

As I said in a comment on the question, the implementation

def gen_mask_2(size, p=0.75):
    mask = np.random.rand(size)
    mask[mask > p] = 0
    mask[mask != 0] = 1
    return mask

can be improved by using the fact that the comparison yields a bool array, which can then be converted to int. This removes the two comparisons with masked assignments you would otherwise have, and it makes for a pretty one-liner :)

def gen_mask_2(size, p=0.75):
    return (np.random.rand(size) < p).astype(int)
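
Since these masks are meant for dropout, here is a minimal sketch of how the one-liner might be plugged in; the inverted-dropout scaling by 1/p and the float32 activations are my assumptions, not something the question specifies:

import numpy as np


def apply_dropout(activations, p=0.75):
    # Keep each unit with probability p; the bool mask multiplies
    # cleanly into a float array, so no explicit cast is needed.
    mask = np.random.rand(*activations.shape) < p
    # Inverted dropout: rescale by 1/p so the expected activation
    # magnitude is unchanged.
    return activations * mask / p


activations = np.random.rand(2048).astype(np.float32)
dropped = apply_dropout(activations)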

Upvotes: 1

Warren Weckesser

Reputation: 114831

Another option is numpy.random.choice, with an input of 0s and 1s where the proportion of 1s is p. For example, for p = 0.75, use np.random.choice([0, 1, 1, 1], size=n):

In [303]: np.random.choice([0, 1, 1, 1], size=16)
Out[303]: array([1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0])

This is faster than using np.random.binomial:

In [304]: %timeit np.random.choice([0, 1, 1, 1], size=10000)
71.8 µs ± 368 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [305]: %timeit np.random.binomial(1, 0.75, 10000)
174 µs ± 348 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

To handle an arbitrary value for p, you can use the p option of np.random.choice, but then the code is slower than np.random.binomial:

In [308]: np.random.choice([0, 1], p=[0.25, 0.75], size=16)
Out[308]: array([1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0])

In [309]: %timeit np.random.choice([0, 1], p=[0.25, 0.75], size=10000)
227 µs ± 781 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
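
If p happens to be a simple fraction, one way to keep the faster pool form is to build the list of 0s and 1s to match it. This is a sketch of that idea rather than part of the timings above, and the limit_denominator cutoff is an arbitrary choice:

import numpy as np
from fractions import Fraction


def gen_mask_pool(size, p=0.75, max_denominator=16):
    # Approximate p as k/n with a small denominator, then reuse the
    # fast "pool of 0s and 1s" form of np.random.choice.
    frac = Fraction(p).limit_denominator(max_denominator)
    pool = [1] * frac.numerator + [0] * (frac.denominator - frac.numerator)
    return np.random.choice(pool, size=size)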

Upvotes: 2

Sheldore

Reputation: 39052

You can make use of the Numba compiler and make things faster by applying the njit decorator to your functions. Below is an example for a very large size:

import numpy as np
from numba import njit


def gen_mask_1(size, p=0.75):
    return np.random.binomial(1, p, size)


@njit(parallel=True)
def gen_mask_2(size, p=0.75):
    mask = np.random.rand(size)
    mask[mask > p] = 0
    mask[mask != 0] = 1
    return mask

%timeit gen_mask_1(size=100000)
%timeit gen_mask_2(size=100000)

2.33 ms ± 215 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
512 µs ± 25.1 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
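
If parallel=True refuses to compile the boolean-indexing version (the asker reports a TypingError in the parfors pass), an explicit prange loop is a variant worth trying; this is only a sketch under that assumption and has not been benchmarked here:

import numpy as np
from numba import njit, prange


@njit(parallel=True)
def gen_mask_loop(size, p=0.75):
    # Each iteration writes to its own slot, so the loop parallelizes
    # without the boolean-mask assignments that trip up parfors.
    mask = np.empty(size, dtype=np.int64)
    for i in prange(size):
        mask[i] = 1 if np.random.random() < p else 0
    return mask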

Upvotes: 2
