Reputation: 51
I have a small numpy problem where I am trying to replace the values of b with the corresponding values of t. See the MWE:
t = np.arange(65,70,2)
>>array([65, 67, 69])
b = np.random.randint(3,size=20)
>>array([2, 2, 1, 2, 1, 2, 2, 1, 1, 0, 2, 2, 2, 2, 2, 0, 2, 0, 2, 2])
Now b should be masked by t so that 2 corresponds to the third element of t and 0 to the first element of t.
What's the most efficient way to do this with numpy?
Upvotes: 0
Views: 60
Reputation: 5939
Way 1
For simple use, you can just replace the items of b like this:
for i in range(3):
    b[b==i] = t[i]
This works, but it is not efficient, especially with a large range of index values, because every iteration scans the whole array.
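One thing to watch with the in-place loop: if any value in t also happened to be a valid index (say t contained 0, 1 or 2), an element replaced in an early iteration could be matched and replaced again later. A minimal sketch that avoids this by writing into a separate array; the short b and the name `out` are chosen here only for illustration:

```python
import numpy as np

t = np.arange(65, 70, 2)          # array([65, 67, 69])
b = np.array([2, 2, 1, 2, 1, 0])  # small illustrative input, not the original b

out = np.empty_like(b)            # separate output, so b itself is never modified
for i in range(len(t)):
    out[b == i] = t[i]            # boolean mask selects every position where b equals i
```

Because the masks are always computed from the untouched b, the result does not depend on what values t holds.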
Way 2
If you would like to optimize it, you need to use grouping, as I have discussed in this post. Borrowing from Divakar's answer, a numpy-only solution requires some deeper understanding:
b = np.array([2, 2, 1, 2, 1, 2, 2, 1, 1, 0, 2, 2, 2, 2, 2, 0, 2, 0, 2, 2])
sidx = np.argsort(b)  # indices that would sort the array
bs = b[sidx]  # sorted array
# positions where the value changes in the sorted array, [0, 3, 7, 20] in this case:
split_idx = np.flatnonzero(np.r_[True,bs[:-1]!=bs[1:],True])
indices = [sidx[i:j] for (i,j) in zip(split_idx[:-1], split_idx[1:])]
indices is a list of arrays [9, 17, 15], [2, 4, 7, 8], [16, 14, 13, 12, 0, 10, 18, 6, 5, 3, 1, 11, 19], which hold the positions where b==0, b==1 and b==2, so you can now use them like so:
for i in range(len(indices)):
    b[indices[i]] = t[i]
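Note that the i-th group belongs to the i-th smallest value actually present in b, so pairing it with t[i] only works when every value from 0 to len(t)-1 occurs. A sketch of the same sort-based grouping that also returns the distinct values, so each group is paired with the right element of t; the helper name `group_indices` and the short b are hypothetical:

```python
import numpy as np

def group_indices(values):
    """Return (distinct values, list of index arrays), one array per distinct value."""
    sidx = np.argsort(values, kind="stable")   # indices that would sort the array
    sorted_vals = values[sidx]
    # positions in the sorted array where a new value starts, plus the end
    boundaries = np.flatnonzero(
        np.r_[True, sorted_vals[:-1] != sorted_vals[1:], True]
    )
    groups = [sidx[i:j] for i, j in zip(boundaries[:-1], boundaries[1:])]
    return sorted_vals[boundaries[:-1]], groups

t = np.arange(65, 70, 2)
b = np.array([2, 0, 2, 2, 0])   # the value 1 is deliberately absent

uniq, groups = group_indices(b)
out = np.empty_like(b)
for value, idx in zip(uniq, groups):
    out[idx] = t[value]         # index t by the value itself, not by the group number
```

The sketch assumes a non-empty one-dimensional integer array whose values are valid indices into t.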
Way 3
This is the most efficient way I've found, but numpy alone is not enough here:
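import pandas as pd
# .indices maps each group key to the positions of its rows;
# list() is needed because dict values cannot be indexed in Python 3
indices = list(pd.DataFrame(b).groupby([0]).indices.values())
for i in range(len(indices)):
    b[indices[i]] = t[i]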
Upvotes: 0
Reputation: 653
You can use a list comprehension for this:
[t[b_elem] for b_elem in b]
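Since every element of b is already a valid position in t, the same lookup can also be written as a single integer-array indexing expression, which performs the per-element lookup inside numpy instead of in a Python loop. A short sketch comparing the two, using the arrays from the question; the variable names are chosen here for illustration:

```python
import numpy as np

t = np.arange(65, 70, 2)  # array([65, 67, 69])
b = np.array([2, 2, 1, 2, 1, 2, 2, 1, 1, 0, 2, 2, 2, 2, 2, 0, 2, 0, 2, 2])

by_comprehension = np.array([t[b_elem] for b_elem in b])  # element-by-element lookup
by_indexing = t[b]                                        # integer-array ("fancy") indexing
```

Both produce an array of the same length as b, and b itself is left unchanged.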
Upvotes: 1