Link_tester

Reputation: 1081

Replacing values in one array with specific values of another array

I have two big NumPy arrays (or pandas DataFrames), e.g.:

a=[[1, 10, 20, 30],[2, 50, 14, -10],[3, 11, 2, 0], ...] 

b=[[10, 40, 30, 1, 1, 2],[0, 11, -1, 32, 3, 2],[9, 2, 51, -2, 3, 2], ...]

I want to replace the last two columns of matrix b with values from a. Each value in the last two columns of b refers to a row of a: where b has 1, it should be replaced by the row of a whose first column is 1 (leaving out that first column). The first column of a is a counter running from 1 to the end. As a result, the number of columns of b grows from 6 to 10.

So, the new b will be something like:

b=[[10, 40, 30, 1, 10, 20, 30, 50, 14, -10],[0, 11, -1, 32, 11, 2, 0, 50, 14, -10],[9, 2, 51, -2, 11, 2, 0, 50, 14, -10], ...]

I appreciate any solution to handle this request with the data either as numpy arrays or pandas.

Upvotes: 0

Views: 157

Answers (2)

mathfux

Reputation: 5949

Assuming the first column of a is of the form [1, 2, 3, ...], it can be done with this one-liner:

np.c_[b[:,:-2], a[b[:,-2]-1, 1:], a[b[:,-1]-1, 1:]]

In fact, it is more convenient to first replace a with a[:, 1:]; the expression then simplifies to:

np.c_[b[:,:-2], a[b[:,-2]-1], a[b[:,-1]-1]]
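As a quick check, the first one-liner can be run on the three sample rows from the question. This is a minimal sketch, assuming a and b are integer NumPy arrays and the first column of a counts 1, 2, 3, ...; the names first, second and new_b are only for illustration.

```python
import numpy as np

a = np.array([[1, 10, 20, 30],
              [2, 50, 14, -10],
              [3, 11, 2, 0]])
b = np.array([[10, 40, 30, 1, 1, 2],
              [0, 11, -1, 32, 3, 2],
              [9, 2, 51, -2, 3, 2]])

# The last two columns of b hold 1-based row numbers of a,
# so subtract 1 to get 0-based indices (assumes a[:, 0] is 1, 2, 3, ...).
first = a[b[:, -2] - 1, 1:]   # rows of a (minus the counter) named by b's fifth column
second = a[b[:, -1] - 1, 1:]  # rows of a (minus the counter) named by b's sixth column

# Keep the first four columns of b and append the two looked-up blocks.
new_b = np.c_[b[:, :-2], first, second]
print(new_b.shape)  # (3, 10)
```

Indexing a with an integer array picks whole rows in one step, so no Python-level loop over b is needed.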

The last two columns of b are converted to indices into a. If the first column of a is anything other than [1, 2, 3, ...], subtracting one is not enough, and you need a different way to map the last two columns of b to row indices of a. I leave that out of scope.
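One possible way to build such a mapping, offered as a sketch and not as part of the answer above: if the IDs in the first column of a are unique, sort them and use np.searchsorted to turn each ID in b into a row position. The sample IDs (7, 3, 5) and the names ids, order, sorted_ids and to_rows are made up for illustration, and every ID in b is assumed to occur in a.

```python
import numpy as np

# Hypothetical data: IDs in a's first column are unique but not 1, 2, 3, ...
a = np.array([[7, 10, 20, 30],
              [3, 50, 14, -10],
              [5, 11, 2, 0]])
b = np.array([[10, 40, 30, 1, 7, 3],
              [0, 11, -1, 32, 5, 3]])

ids = a[:, 0]
order = np.argsort(ids)      # row positions of a, ordered by ID
sorted_ids = ids[order]

def to_rows(keys):
    """Map an array of IDs to row positions in a (assumes every ID is present)."""
    pos = np.searchsorted(sorted_ids, keys)  # position of each key among the sorted IDs
    return order[pos]                        # translate back to positions in a

new_b = np.c_[b[:, :-2],
              a[to_rows(b[:, -2]), 1:],
              a[to_rows(b[:, -1]), 1:]]
```

If some ID in b might be missing from a, the result of to_rows would need an explicit check, since np.searchsorted returns an insertion position even when there is no exact match.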

Upvotes: 1

Rory O'Connell

Reputation: 76

Two suggestions.

  1. If these are in pandas dataframes, you can join the 'a' dataframe to the 'b' dataframe twice (call the two joined copies a0 and a1), matching on b.5 = a0.1 and b.6 = a1.1. Then read off the columns you need (b.1-4, a0.2-4, a1.2-4). Something like:

    new1 = pd.merge(b, a, left_on='5', right_on='1')
    new2 = pd.merge(new1, a, left_on='6', right_on='1')
    

Then drop columns 5 and 6, along with the two joined copies of a's first column.
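A fuller sketch of the double merge follows. The question does not name its columns, so the labels here (b1 to b6 for b; id, x, y, z for a) are assumptions, as are the helper names a0, a1 and new.

```python
import pandas as pd

a = pd.DataFrame([[1, 10, 20, 30], [2, 50, 14, -10], [3, 11, 2, 0]],
                 columns=["id", "x", "y", "z"])
b = pd.DataFrame([[10, 40, 30, 1, 1, 2], [0, 11, -1, 32, 3, 2], [9, 2, 51, -2, 3, 2]],
                 columns=["b1", "b2", "b3", "b4", "b5", "b6"])

# Two copies of a with distinct column names, so the merges cannot collide.
a0 = a.add_suffix("_0")   # id_0, x_0, y_0, z_0
a1 = a.add_suffix("_1")   # id_1, x_1, y_1, z_1

# how="left" keeps every row of b in its original order.
new = (b.merge(a0, how="left", left_on="b5", right_on="id_0")
        .merge(a1, how="left", left_on="b6", right_on="id_1")
        .drop(columns=["b5", "b6", "id_0", "id_1"]))
print(list(new.columns))
```

Renaming the columns of a up front avoids the automatic _x/_y suffixes that pandas adds when both frames share column names.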

  2. Otherwise, I would suggest turning 'a' into a different structure, such as a list of tuples or a dictionary. Your index is embedded as the first value, so if you went the dictionary route you would build {1: [10, 20, 30], 2: [50, 14, -10], 3: [11, 2, 0], ...}, which makes the lookup easier.

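    # Build the lookup: first value of each row of a -> the remaining values
    lookup = {row[0]: list(row[1:]) for row in a}
    newlist = []
    for x in b:
        q = list(x[:4])           # copy the first four values of the row
        q.extend(lookup[x[4]])    # values of a named by the fifth value
        q.extend(lookup[x[5]])    # values of a named by the sixth value
        newlist.append(q)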
    

Upvotes: 0
