connor449

Reputation: 1679

How to use vector lookup by index in pandas?

I have a df that looks like this:

    time                a                b
0  0.000                6                5
1  0.008                6                9
2  0.016                1                9
3  0.024                2                7
4  0.032                1                5

I want to use each value from df.a and df.b as an index in the vector below:

x =  [-6, -4, -3, -2, -1, 0.5, 1, 2, 4, 6]

The result should be two new columns, df.a_ and df.b_, holding the values of x at the positions given by df.a and df.b. Because x is 0-indexed, I also want to subtract 1 from each df.a and df.b value before indexing. It should look like this:

    time                a                b          a_      b_
0  0.000                6                5         0.5     -1
1  0.008                6                9         0.5      4
2  0.016                1                9         -6       4
3  0.024                2                7         -4       2
4  0.032                1                5         -6       -1

Upvotes: 2

Views: 199

Answers (5)

Andy L.

Reputation: 25259

I would also use NumPy fancy indexing, but apply it to all columns at once and build a dict to pass to assign. It is just a different way to assign multiple columns in one step.

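import numpy as np

a = np.array(x)
cols = ['a', 'b']
# index x with the 0-based positions, then pair each result row with its new column name
d = dict(zip(np.char.add(cols, '_'), a[df[cols].to_numpy() - 1].T))
df = df.assign(**d)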

Out[721]:
    time  a  b   a_   b_
0  0.000  6  5  0.5 -1.0
1  0.008  6  9  0.5  4.0
2  0.016  1  9 -6.0  4.0
3  0.024  2  7 -4.0  1.0
4  0.032  1  5 -6.0 -1.0

Upvotes: 1

BENY

Reputation: 323326

IIUC

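# map each 1-based position to the corresponding value of x
d = dict(zip(range(1, len(x) + 1), x))
# on pandas >= 2.1, DataFrame.map replaces the deprecated applymap
df = pd.concat([df, df[['a', 'b']].applymap(d.get).add_suffix('_')], axis=1)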
Out[15]: 
    time  a  b   a_  b_
0  0.000  6  5  0.5  -1
1  0.008  6  9  0.5   4
2  0.016  1  9 -6.0   4
3  0.024  2  7 -4.0   1
4  0.032  1  5 -6.0  -1
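
One property of the dict-based lookup that may matter in practice: a value with no matching key produces a missing value instead of raising an error. The sketch below illustrates this with Series.map; the sample Series (including the out-of-range value 11) is made up for illustration and is not from the question.

```python
import pandas as pd

x = [-6, -4, -3, -2, -1, 0.5, 1, 2, 4, 6]
# 1-based position -> value of x, as in the answer above
lookup = dict(zip(range(1, len(x) + 1), x))

# Hypothetical data: 11 has no entry in lookup because x has only 10 elements
s = pd.Series([6, 1, 11])
mapped = s.map(lookup)  # the unmatched value becomes NaN

print(mapped)
```

Whether a silent NaN or a hard error is preferable depends on how much you trust the values in the index columns.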

Upvotes: 0

Quang Hoang

Reputation: 150785

IIUC, this can be solved with numpy indexing:

import numpy as np

x = np.array([-6, -4, -3, -2, -1, 0.5, 1, 2, 4, 6])

# subtract 1 to convert the 1-based values to 0-based positions in x
df['a_'] = x[df['a'] - 1]
df['b_'] = x[df['b'] - 1]

# if you have more than two columns:
# for col in df.columns[1:]:
#     df[col+'_'] = x[df[col] - 1]

Output:

    time  a  b   a_   b_
0  0.000  6  5  0.5 -1.0
1  0.008  6  9  0.5  4.0
2  0.016  1  9 -6.0  4.0
3  0.024  2  7 -4.0  1.0
4  0.032  1  5 -6.0 -1.0
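
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption reconstructed from the sample in the question, and the explicit to_numpy() call is just a defensive choice so that NumPy receives a plain integer array.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    "time": [0.000, 0.008, 0.016, 0.024, 0.032],
    "a": [6, 6, 1, 2, 1],
    "b": [5, 9, 9, 7, 5],
})
x = np.array([-6, -4, -3, -2, -1, 0.5, 1, 2, 4, 6])

for col in ["a", "b"]:
    # Subtract 1 to turn the 1-based values into 0-based positions in x
    df[col + "_"] = x[df[col].to_numpy() - 1]

print(df)
```

Because x contains 0.5, the whole array is float, which is why both new columns come out as floats.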

Upvotes: 1

James

Reputation: 36691

It looks like your indexing of x is off by 1. But here is a quick way to do it using apply.

df['a_'] = df.a.apply(lambda r: x[r-1])
df['b_'] = df.b.apply(lambda r: x[r-1])

df
# returns:
    time  a  b   a_  b_
0  0.000  6  5  0.5  -1
1  0.008  6  9  0.5   4
2  0.016  1  9 -6.0   4
3  0.024  2  7 -4.0   1
4  0.032  1  5 -6.0  -1
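
The off-by-one remark may refer to the row where b is 7: the expected output in the question lists 2 there, while subtracting 1 before indexing gives 1. A quick check, assuming x as defined in the question:

```python
x = [-6, -4, -3, -2, -1, 0.5, 1, 2, 4, 6]
b = 7

with_offset = x[b - 1]   # 0-based position 6
without_offset = x[b]    # 0-based position 7

print(with_offset, without_offset)
```

The first value is what the stated rule (subtract 1, then index) produces; the second matches the number shown in the question's expected table.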

Upvotes: 1

It_is_Chris

Reputation: 14113

One option is apply with a lambda, subtracting 1 so that the 1-based values become 0-based positions in x:

df['a_'] = df['a'].apply(lambda y: x[y - 1])
df['b_'] = df['b'].apply(lambda y: x[y - 1])

    time  a  b   a_  b_
0  0.000  6  5  0.5  -1
1  0.008  6  9  0.5   4
2  0.016  1  9 -6.0   4
3  0.024  2  7 -4.0   1
4  0.032  1  5 -6.0  -1

Upvotes: 0
