Eliot
Eliot

Reputation: 2429

Finding column values for the first and last row in the previous year

I have the price history for Google stock in a Google Colab doc, like this:

df = pd.DataReader('GOOG', data_source='yahoo', start='08-01-2004')

These are open, high, low, close and adjusted close prices for each trading day in the price history. I can create a new column in the DataFrame for the rate of return of the stock over the trailing 12 months like this:

df['Trailing 12 month return'] = (df['Adj Close'] - 
  df['Adj Close'].shift(DAYS_TRADING_PER_YEAR)) / 
  df['Adj Close'].shift(DAYS_TRADING_PER_YEAR)

But what if what I actually want is one value for rate of the return per year, looking at the return over the previous calendar year? So, for 2015, just find the first trading day (more correctly, the first day for which we have data) in 2014 and the last trading day in 2014 and get the percentage change over that period?

Upvotes: 2

Views: 72

Answers (1)

tdy
tdy

Reputation: 41327

Assuming Date is a proper datetime column:

  1. groupby(df.Date.dt.year) to group by year
  2. apply() the yearly rates computed from first_valid_index() and last_valid_index()
  3. shift() the results to get the previous year
rates = df.groupby(df.Date.dt.year)['Adj Close'].apply(
    lambda g: (g.loc[g.last_valid_index()] - g.loc[g.first_valid_index()]) / g.loc[g.first_valid_index()]
).shift()

# Date
# 2014         NaN
# 2015    0.128019
# 2016    2.232232
# 2017    1.041269
# 2018    0.292042
# 2019    0.154558
# 2020   -0.136102
# 2021    0.396961
# Name: Adj Close, dtype: float64
  1. Then map() these rates with df.Date.dt.year to create the new column:
df['Previous year rate of return'] = df.Date.dt.year.map(rates)

#            Date   Adj Close  Previous year rate of return
# 0    2014-08-01  166.724074                           NaN
# 1    2014-08-02   69.634211                           NaN
# ...         ...         ...                           ...
# 999  2017-04-26  165.225121                      1.041269
# 1000 2017-04-27   40.165297                      1.041269
# ...         ...         ...                           ...
# 2433 2021-03-30   67.864861                      0.396961
# 2434 2021-03-31   31.408317                      0.396961

Upvotes: 1

Related Questions