Reputation: 1795

In Pandas how can I use the values in one table as an index to extract data from another table?

I feel like this should be really simple but I'm having a hard time with it. Suppose I have this:

df1:

ticker  hhmm <--- The hhmm value corresponds to the column in df2
======  ====
AAPL    0931
IBM     0930
XRX     1559

df2:

ticker  0930  0931  0932 ... 1559   <<---- 390 columns
======  ====  ====  ==== ... ====
AAPL    4.56  4.57  ...      ...     
IBM     7.98  ...   ...      ...
XRX     3.33  ...   ...      3.78

The goal is to create a new column in df1 whose value is df2[df1['hhmm']].

For example:

df1:

ticker  hhmm  df2val
======  ====  ======
AAPL    0931    4.57
IBM     0930    7.98
XRX     1559    3.78

Both df's have 'ticker' as their index, so I could simply join them BUT assume that this uses too much memory (the dataframes I'm using are much larger than the examples shown here).

I've tried apply and it's slooooow (15 minutes to run).

What's the Pandas Way to do this? Thanks!

Upvotes: 3

Answers (3)

Vaishali

Reputation: 38415

Try

df2.set_index('ticker').stack().loc[df1.apply(tuple, axis = 1)]

ticker      
AAPL    931     4.57
IBM     930     7.98
XRX     1559    3.78

Upvotes: 1

BENY

Reputation: 323326

There is a function called lookup

df1['val']=df2.set_index('ticker').lookup(df1.ticker,df1.hhmm)
df1
Out[290]: 
  ticker  hhmm    val
0   AAPL  0931   4.57
1    IBM  0930   7.98
2    XRX  1559  33.00# I make up this number

Upvotes: 1

YOLO

Reputation: 21739

Here's a minimal example of what you are trying to do. Hope this gives you enough hint:

# sample data
df1 = pd.DataFrame({'ticker':['AAPL','IBM','XRX'], 'hhmm':['0931','0930','1559']})

df2 = pd.DataFrame({'ticker':['AAPL','IBM','XRX'],
                    '0931': [2,2,3],
                    '0930': [5,6,7],
                    '1559': [8,7,6]})

# melt to match the format to join
df2 = pd.melt(df2, id_vars='ticker',var_name='hhmm',value_name='df2val')

# join to df1
df1.merge(df2, on=['ticker','hhmm'])

    hhmm    ticker  df2val
0   0931    AAPL    2
1   0930    IBM     6
2   1559    XRX     6

Upvotes: 1

In Pandas how can I use the values in one table as an index to extract data from another table?

Answers (3)

Related Questions