Tartaglia
Tartaglia

Reputation: 1041

Conditional subtraction of Pandas Columns

I have a dataframe with stock returns and I would like to create a new column that contains the difference between that stock return and the return of the sector ETF it belongs to:

dict0 = {'date': ['1/1/2020', '1/1/2020', '1/1/2020', '1/1/2020', '1/1/2020', '1/2/2020', '1/2/2020', '1/2/2020', '1/2/2020',
                  '1/2/2020', '1/3/2020', '1/3/2020', '1/3/2020', '1/3/2020', '1/3/2020'],
         'ticker': ['SPY', 'AAPL', 'AMZN', 'XLK', 'XLY', 'SPY', 'AAPL', 'AMZN', 'XLK', 'XLY', 'SPY', 'AAPL', 'AMZN', 'XLK', 'XLY'],
         'returns': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5],
        'sector': [np.NaN, 'Tech', 'Cons Disc', np.NaN, np.NaN, np.NaN, 'Tech', 'Cons Disc', np.NaN, np.NaN, np.NaN, 'Tech', 'Cons Disc', np.NaN, np.NaN,]}
df = pd.DataFrame(dict0)
df = df.set_index(['date', 'ticker'])

That is for instance, for AAPL on 1/1/2020 the return is 2. Since it belongs to the Tech Sector, the relevant return would be the ETF XLK (I have a dictionary that maps sectors to ETF tickers). The in the new column the return would be AAPL's return of 2 minus the XLK return on that day of 4.

I have asked a similar question in the post below, where I wanted to simply compute the difference of reach stock return to 1 ticker, namely SPY.

Computing excess returns

The solution presented there was this:

def func(row):
    date, asset = row.name
    return df.loc[(date, asset), 'returns'] - df.loc[(date, 'SPY'), 'returns']



dict0 = {'date': ['1/1/2020', '1/1/2020', '1/1/2020', '1/2/2020', '1/2/2020',
                  '1/2/2020', '1/3/2020', '1/3/2020', '1/3/2020'],
         'ticker': ['SPY', 'AAPL', 'MSFT', 'SPY', 'AAPL', 'MSFT', 'SPY', 'AAPL', 'MSFT'],
         'returns': [1, 2, 3, 4, 5, 6, 7, 8, 9]}
df = pd.DataFrame(dict0)  ###
df = df.set_index(['date', 'ticker'])
df['excess_returns'] = df.apply(func, axis=1)

But I haven't been able to modify it so that I can do this sector based. I appreciate any suggestions.

Upvotes: 0

Views: 80

Answers (1)

rudolfovic
rudolfovic

Reputation: 3286

You are almost there:

def func(row):
    date, asset = row.name
    index = sector_to_index_mapping[row.sector]
    return df.loc[(date, asset), 'returns'] - df.loc[(date, index), 'returns']

Upvotes: 1

Related Questions