How can a numpy array be replaced by another numpy array with other dimensions?

Question

I have a small numpy problem where I am trying to replace the values of b with values from t. See the MWE:

import numpy as np

t = np.arange(65, 70, 2)
# array([65, 67, 69])

b = np.random.randint(3, size=20)
# array([2, 2, 1, 2, 1, 2, 2, 1, 1, 0, 2, 2, 2, 2, 2, 0, 2, 0, 2, 2])

Now b should be mapped through t, so that 2 corresponds to the third element of t and 0 to the first element of t.

What's the most efficient way to do this with numpy?

mathfux · Accepted Answer

Way 1

For simple cases you can just replace the items of b like this:

for i in range(3):
    b[b==i] = t[i]

This works, but it is not efficient, especially with a large range of index values, because every iteration scans the whole of b.
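
For comparison, because every value in b is already a valid position in t, NumPy's integer array indexing can do the same lookup without a loop. This is not one of the ways described in this answer; it is a minimal sketch that assumes b contains only integers in the range 0 to len(t) - 1. The names mapped and looped are illustrative.

```python
import numpy as np

t = np.arange(65, 70, 2)  # array([65, 67, 69])
b = np.array([2, 2, 1, 2, 1, 2, 2, 1, 1, 0, 2, 2, 2, 2, 2, 0, 2, 0, 2, 2])

# Integer array indexing: element k of the result is t[b[k]].
# b itself is left unchanged; the result is a new array of the same shape as b.
mapped = t[b]

# The loop from Way 1, applied to a copy of b, for comparison.
looped = b.copy()
for i in range(3):
    looped[looped == i] = t[i]

# Both approaches should agree element by element.
print(np.array_equal(mapped, looped))
```

Unlike the loop, this does not modify b in place, so assign the result back (b = t[b]) if that is what you need.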

Way 2

If you would like to optimize it, you need to group the indices, as I have discussed in this post. Borrowing from Divakar's answer, a numpy-only solution requires some deeper understanding:

b = np.array([2, 2, 1, 2, 1, 2, 2, 1, 1, 0, 2, 2, 2, 2, 2, 0, 2, 0, 2, 2])
sidx = np.argsort(b)  # indices that would sort b
bs = b[sidx]  # sorted array
# positions where the value changes in the sorted array, [ 0,  3,  7, 20] in this case:
split_idx = np.flatnonzero(np.r_[True, bs[:-1] != bs[1:], True])
# one array of original positions per distinct value of b
indices = [sidx[i:j] for (i, j) in zip(split_idx[:-1], split_idx[1:])]

indices is a list of arrays [ 9, 17, 15], [2, 4, 7, 8], [16, 14, 13, 12, 0, 10, 18, 6, 5, 3, 1, 11, 19], which hold the positions where b==0, b==1 and b==2 respectively, so you can now use them like so:

for i in range(len(indices)):
    b[indices[i]] = t[i]

Way 3

This is the most efficient way I've found, but numpy alone is not enough here:

import pandas as pd
# .indices maps each distinct value of b to the array of positions where it occurs
indices = pd.DataFrame(b).groupby(0).indices
for value, idx in indices.items():
    b[idx] = t[value]
