Kaynef21

Reputation: 51

Most efficient way to set a DataFrame column by indexing into other columns

I have a large DataFrame. One of its columns contains the names of other columns. For each row, I want to evaluate this column and replace it with the value of the column it references:

|A|B|C|Column|
|:-|:-|:-|:-----|
|1|3|4|  B   |
|2|5|3|  A   |
|3|5|9|  C   |

Desired output:

|A|B|C|Column|
|:-|:-|:-|:-----|
|1|3|4|  3   |
|2|5|3|  2   |
|3|5|9|  9   |

I am achieving this result using:

df.apply(lambda d: eval("d." + d['Column']), axis=1)

But it is very slow, even with swifter. Is there a more efficient way to do this?

Upvotes: 1

Views: 156

Answers (3)

Mayank Porwal

Reputation: 34076

For better performance, use df.to_numpy():

In [365]: df['Column'] = df.to_numpy()[df.index, df.columns.get_indexer(df.Column)]

In [366]: df
Out[366]: 
   A  B  C Column
0  1  3  4      3
1  2  5  3      2
2  3  5  9      9
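One caveat worth noting: passing df.index as the row selector only works when the index is the default 0..n-1 range, because NumPy treats those values as row positions. A minimal sketch that uses explicit positions instead, assuming every name in Column is an existing column (the frame and its index labels here are made up for illustration):

```python
import numpy as np
import pandas as pd

# Illustrative frame with a non-default index (labels are hypothetical).
df = pd.DataFrame(
    {"A": [1, 2, 3], "B": [3, 5, 5], "C": [4, 3, 9], "Column": ["B", "A", "C"]},
    index=[10, 20, 30],
)

rows = np.arange(len(df))                     # positional row numbers
cols = df.columns.get_indexer(df["Column"])   # positional column numbers
picked = df.to_numpy()[rows, cols]            # one value per row
print(picked.tolist())
```

Note that get_indexer returns -1 for a name it cannot find, which would silently select the last column, so it may be worth checking `(cols == -1).any()` first.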

Upvotes: 1

BENY

Reputation: 323306

Since lookup is going to be deprecated, try the NumPy approach with get_indexer:

df['new'] = df.values[df.index,df.columns.get_indexer(df.Column)]
df
Out[75]: 
   A  B  C Column new
0  1  3  4      B   3
1  2  5  3      A   2
2  3  5  9      C   9
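Because Column holds strings, df.values over the whole frame is an object array, so new comes back with object dtype. A rough sketch that indexes only the value columns to keep an integer dtype, assuming the value columns are everything except Column and that the index is not relied on for positions:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"A": [1, 2, 3], "B": [3, 5, 5], "C": [4, 3, 9], "Column": ["B", "A", "C"]}
)

values = df.drop(columns="Column")                 # numeric columns only
rows = np.arange(len(df))                          # positional row numbers
cols = values.columns.get_indexer(df["Column"])    # positions within `values`
df["new"] = values.to_numpy()[rows, cols]
print(df["new"].dtype)
```

On a large frame this also avoids building an object array of the whole table, which is likely to help memory use and speed.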

Upvotes: 0

Quang Hoang

Reputation: 150765

For Pandas < 1.2.0, use lookup:

df['Column'] = df.lookup(df.index, df['Column'])

From 1.2.0 onward, lookup is deprecated, so you can just use a for loop:

df['Column'] = [df.at[idx, r['Column']] for idx, r in df.iterrows()]

Output:

   A  B  C  Column
0  1  3  4       3
1  2  5  3       2
2  3  5  9       9
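The loop is easy to read but works row by row, so it may still be slow on a large frame. As far as I know, the pandas indexing guide suggests a factorize plus reindex pattern as the vectorized replacement for lookup. A minimal sketch of that pattern on the question's data (variable names are my own):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"A": [1, 2, 3], "B": [3, 5, 5], "C": [4, 3, 9], "Column": ["B", "A", "C"]}
)

# codes: integer code per row; names: the unique column names referenced
codes, names = pd.factorize(df["Column"])

# Keep only the referenced columns, in factorize order, then pick one cell per row
result = df.reindex(columns=names).to_numpy()[np.arange(len(df)), codes]
print(result.tolist())
```

Assigning `df['Column'] = result` then gives the same frame as the output above.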

Upvotes: 0
