Brandon Ginley

Reputation: 269

Python numpy bool masks for absoluting values

Suppose you have a numpy array of shape (n, n), e.g.

    x = np.arange(25).reshape(5,5)

and you fill x with random integers between -5 and 5. Is there a way to use a boolean mask so that every value equal to 0 becomes 1 and every nonzero value becomes 0? (i.e., if [index] > 0 or [index] < 0, then [index] = 0, and if [index] == 0, then [index] = 1)

I know you could use an iteration to change each element, but my goal is speed and as such I would like to eliminate as many loops as possible from the finalized script.

EDIT: Open to other ideas, as well, of course, as long as speed/efficiency is kept in mind

Upvotes: 2

Views: 365

Answers (4)

Divakar

Reputation: 221504

You can compare with 0 to get a boolean array, then convert it to an int datatype either by adding 0 or by typecasting with .astype(int). That gives two approaches.

Approach #1 :

(x==0)+0

Approach #2 :

(x==0).astype(int)
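As a quick sanity check, here is a minimal sketch of both approaches on a small array. The sample values are my own, picked to include negatives and zeros; they are not from the question.

```python
import numpy as np

# Illustrative sample values (an assumption, not the asker's data)
x = np.array([[-3, 0, 2],
              [0, 5, -1]])

mask = (x == 0)              # boolean array: True where x is zero
res_add = mask + 0           # Approach #1: adding an int promotes bool to int
res_cast = mask.astype(int)  # Approach #2: explicit cast to int

# Both give [[0, 1, 0], [1, 0, 0]]
print(res_add)
print(res_cast)
```

Neither approach modifies x; both return a new array.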

Runtime tests

This section compares the runtimes of the two approaches above, along with the other numpy-based approach that converts x to a boolean datatype first:

Case #1:

In [36]: x = np.arange(25).reshape(5,5)

In [37]: %timeit (x==0)+0
    ...: %timeit (x==0).astype(int)
    ...: %timeit 1 - x.astype(bool).astype(int)
    ...: 
1000000 loops, best of 3: 1.85 µs per loop
1000000 loops, best of 3: 1.08 µs per loop
1000000 loops, best of 3: 1.51 µs per loop

Case #2:

In [38]: x = np.random.randint(0,50,(10000,10000))

In [39]: %timeit (x==0)+0
    ...: %timeit (x==0).astype(int)
    ...: %timeit 1 - x.astype(bool).astype(int)
    ...: 
1 loops, best of 3: 227 ms per loop
10 loops, best of 3: 186 ms per loop
1 loops, best of 3: 319 ms per loop

In both cases, (x==0).astype(int) is the fastest of the three.

Upvotes: 2

Praveen

Reputation: 7222

Firstly, you could instantiate your array directly using np.random.randint:

# Note: the lower limit is inclusive while the upper limit is exclusive
x = np.random.randint(-5, 6, size=(5, 5))

To actually get the job done, you could cast to bool, cast back to int, and then subtract the result from 1:

res = 1 - x.astype(bool).astype(int)

Alternatively, you could be a bit more explicit:

x[x != 0] = 1
res = 1 - x

But the second method seems to take more than twice as much time:

>>> n = 1000
>>> a = np.random.randint(-5, 6, (n, n))
>>> %timeit a.astype(bool).astype(int)
1000 loops, best of 3: 1.58 ms per loop
>>> %timeit a[a != 0] = 1
100 loops, best of 3: 4.61 ms per loop

Upvotes: 2

maxymoo

Reputation: 36545

First of all, you don't need to use reshape; you can create your random matrix directly like this:

M = np.random.randint(-5, 6, (2, 2))  # upper bound is exclusive, so 6 includes 5

Then if you want to do your substitutions you can just do your indexing like this:

M[M==1]=10
M[M==0]=1
M[M==10]=0
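Note that the three assignments above swap 0 and 1 but leave every other nonzero value unchanged. If the goal is the one stated in the question (all nonzero values become 0), a hedged variant is to capture the zero mask first and then assign through it. The name zeros and the sample values are my own:

```python
import numpy as np

# Illustrative sample values (an assumption, not the asker's data)
M = np.array([[-4, 0],
              [1, 3]])

zeros = (M == 0)   # record where the zeros are before modifying M
M[~zeros] = 0      # every nonzero entry becomes 0
M[zeros] = 1       # every original zero becomes 1

print(M)  # [[0, 1], [0, 0]] as a 2x2 array
```

Saving the mask up front avoids needing a placeholder value such as 10.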

Upvotes: 0

DG1

Reputation: 171

You could use a list comprehension for this...

bool_x = [0 if y != 0 else 1 for y in x.ravel()]

If you are going for speed, though, consider whether you really need the array to be 5x5 before converting, or whether you could start from np.arange(25), apply the list comprehension directly, and then reshape. All that reshaping will surely cost you something.
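For reference, here is a sketch that runs the comprehension over the flattened array, reshapes the result, and checks it against the vectorized mask from the other answers. Using ravel() to flatten is my choice here, and the comprehension still loops in Python, so it is unlikely to beat the vectorized form on large arrays:

```python
import numpy as np

x = np.arange(25).reshape(5, 5)

# Python-level loop over the 25 scalar elements
flat = [0 if y != 0 else 1 for y in x.ravel()]
bool_x = np.array(flat).reshape(x.shape)

# Vectorized equivalent, with no Python-level loop
vectorized = (x == 0).astype(int)

print(np.array_equal(bool_x, vectorized))  # True
```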

Upvotes: 0
