Reputation: 6668

calculating stock returns from a dataframe with stacked prices

I have a dataframe that contains three columns, like below.

  stock    date         price
  abc      2020-01-20   102.3
  efg      2020-01-20   36.2
  xyz      2020-01-20   341
  abc      2020-01-21   102.3
  efg      2020-01-21   34.3
  xyz      2020-01-21   321
  abc      2020-01-22   104.1
  efg      2020-01-22   35.2
  xyz      2020-01-22   318

I want to calculate the daily returns for each stock, so end up with a dataframe that looks like below (please note that I haven't calculated the returns here I have just put some random numbers & obviously there will be no returns for the 20th),

Return

the return of a stock for a day is defined as the following,

return for stock efg on 21st = (price 21st / price 20th) - 1

stock    date         return
  abc      2020-01-21   0
  efg      2020-01-21   -0.23
  xyz      2020-01-21   -1.53
  abc      2020-01-22   0.86
  efg      2020-01-22   0.15
  xyz      2020-01-22   -0.18

I have tried the code below,

df_ret = pd.pivot_table(df_ret, values='price', index=['stock'], columns = 'date')   
df_ret = df_ret.transpose()
rets = df_ret.pct_change()
rets = rets.transpose()

rets has the correct data but I need the data in the format shown in the 2nd table above. I'm not sure the best way of doing this?

udpate

df = df_ret.assign(ret = df_ret.groupby('stock')['price'].pct_change())

From using the line above I get the results below. Its close to be correct but the value on 22nd for 904000 should be nan.

   stock    date        price           ret
   903600   2020-01-22  11741.189956    nan
   903600   2020-01-23  11678.197357    -0.005365094955116612
   903600   2020-01-24  11683.808897    0.00048051422907646746
   903600   2020-01-27  11683.808897    0.0
   904000   2020-01-22  2017.933988    -0.8272880012169546
   904000   2020-01-23  1993.986017    -0.011867569079271667
   904000   2020-01-24  2012.609294    0.009339722967576014
   904000   2020-01-27  1975.147341   -0.01861362417021617

Upvotes: 0

Answers (4)

Nicolas Gervais

Reputation: 36714

Try this simple solution:

df['change_in_pct'] = (df.price - df.price.shift(1)) /df.price

Upvotes: 0

Celius Stingher

Reputation: 18377

This is how I would do it:

import pandas as pd
data = {'stock':['abc','efg','xyz','abc','efg','xyz','abc','efg','xyz'],'date':['2020-01-20','2020-01-20','2020-01-20','2020-01-21','2020-01-21','2020-01-21','2020-01-22','2020-01-22','2020-01-22'],'price':[101,202,303,102,242,403,204,420,422]}
df = pd.DataFrame(data)
df['return'] = df.groupby('stock')['price'].pct_change()
df = df.dropna(how='any')
print(df)

Output:

  stock        date  price    return
3   abc  2020-01-21    102  0.009901
4   efg  2020-01-21    242  0.198020
5   xyz  2020-01-21    403  0.330033
6   abc  2020-01-22    204  1.000000
7   efg  2020-01-22    420  0.735537
8   xyz  2020-01-22    422  0.047146

Upvotes: 1

MCF

Reputation: 1

Assuming df_ret has no index, I would unstack and restack.

df_ret.set_index(['date', 'stock'], inplace=True)
df_ret = df_ret.unstack(axis=0,level='stock')
rets = df_ret.pct_change()
rets=rets.stack()
rets.reset_index(inplace=True)

Upvotes: 0

anky

Reputation: 75150

Just groupby + pct_change works:

final = df.assign(return_ = df.groupby('stock')['price'].pct_change())

  stock       date  price   return_
0   abc 2020-01-20  102.3       NaN
1   efg 2020-01-20   36.2       NaN
2   xyz 2020-01-20  341.0       NaN
3   abc 2020-01-21  102.3  0.000000
4   efg 2020-01-21   34.3 -0.052486
5   xyz 2020-01-21  321.0 -0.058651
6   abc 2020-01-22  104.1  0.017595
7   efg 2020-01-22   35.2  0.026239
8   xyz 2020-01-22  318.0 -0.009346

Upvotes: 3

calculating stock returns from a dataframe with stacked prices

Answers (4)

Related Questions