Reputation: 1795
I feel like this should be really simple but I'm having a hard time with it. Suppose I have this:
df1:
ticker hhmm <--- The hhmm value corresponds to the column in df2
====== ====
AAPL 0931
IBM 0930
XRX 1559
df2:
ticker 0930 0931 0932 ... 1559 <<---- 390 columns
====== ==== ==== ==== ... ====
AAPL 4.56 4.57 ... ...
IBM 7.98 ... ... ...
XRX 3.33 ... ... 3.78
The goal is to create a new column in df1 whose value is df2[df1['hhmm']].
For example:
df1:
ticker hhmm df2val
====== ==== ======
AAPL 0931 4.57
IBM 0930 7.98
XRX 1559 3.78
Both df's have 'ticker' as their index, so I could simply join them BUT assume that this uses too much memory (the dataframes I'm using are much larger than the examples shown here).
I've tried apply and it's slooooow (15 minutes to run).
What's the Pandas Way to do this? Thanks!
Upvotes: 3
Views: 1209
Reputation: 38415
Try
df2.set_index('ticker').stack().loc[df1.apply(tuple, axis = 1)]
ticker
AAPL 931 4.57
IBM 930 7.98
XRX 1559 3.78
Upvotes: 1
Reputation: 323326
There is a function called lookup
df1['val']=df2.set_index('ticker').lookup(df1.ticker,df1.hhmm)
df1
Out[290]:
ticker hhmm val
0 AAPL 0931 4.57
1 IBM 0930 7.98
2 XRX 1559 33.00# I make up this number
Upvotes: 1
Reputation: 21739
Here's a minimal example of what you are trying to do. Hope this gives you enough hint:
# sample data
df1 = pd.DataFrame({'ticker':['AAPL','IBM','XRX'], 'hhmm':['0931','0930','1559']})
df2 = pd.DataFrame({'ticker':['AAPL','IBM','XRX'],
'0931': [2,2,3],
'0930': [5,6,7],
'1559': [8,7,6]})
# melt to match the format to join
df2 = pd.melt(df2, id_vars='ticker',var_name='hhmm',value_name='df2val')
# join to df1
df1.merge(df2, on=['ticker','hhmm'])
hhmm ticker df2val
0 0931 AAPL 2
1 0930 IBM 6
2 1559 XRX 6
Upvotes: 1