Reputation: 6668
I have a dataframe that contains three columns, like below.
stock date price
abc 2020-01-20 102.3
efg 2020-01-20 36.2
xyz 2020-01-20 341
abc 2020-01-21 102.3
efg 2020-01-21 34.3
xyz 2020-01-21 321
abc 2020-01-22 104.1
efg 2020-01-22 35.2
xyz 2020-01-22 318
I want to calculate the daily returns for each stock, so end up with a dataframe that looks like below (please note that I haven't calculated the returns here I have just put some random numbers & obviously there will be no returns for the 20th),
Return
the return of a stock for a day is defined as the following,
return for stock efg on 21st = (price 21st / price 20th) - 1
stock date return
abc 2020-01-21 0
efg 2020-01-21 -0.23
xyz 2020-01-21 -1.53
abc 2020-01-22 0.86
efg 2020-01-22 0.15
xyz 2020-01-22 -0.18
I have tried the code below,
df_ret = pd.pivot_table(df_ret, values='price', index=['stock'], columns = 'date')
df_ret = df_ret.transpose()
rets = df_ret.pct_change()
rets = rets.transpose()
rets has the correct data but I need the data in the format shown in the 2nd table above. I'm not sure the best way of doing this?
udpate
df = df_ret.assign(ret = df_ret.groupby('stock')['price'].pct_change())
From using the line above I get the results below. Its close to be correct but the value on 22nd for 904000 should be nan.
stock date price ret
903600 2020-01-22 11741.189956 nan
903600 2020-01-23 11678.197357 -0.005365094955116612
903600 2020-01-24 11683.808897 0.00048051422907646746
903600 2020-01-27 11683.808897 0.0
904000 2020-01-22 2017.933988 -0.8272880012169546
904000 2020-01-23 1993.986017 -0.011867569079271667
904000 2020-01-24 2012.609294 0.009339722967576014
904000 2020-01-27 1975.147341 -0.01861362417021617
Upvotes: 0
Views: 1157
Reputation: 36714
Try this simple solution:
df['change_in_pct'] = (df.price - df.price.shift(1)) /df.price
Upvotes: 0
Reputation: 18377
This is how I would do it:
import pandas as pd
data = {'stock':['abc','efg','xyz','abc','efg','xyz','abc','efg','xyz'],'date':['2020-01-20','2020-01-20','2020-01-20','2020-01-21','2020-01-21','2020-01-21','2020-01-22','2020-01-22','2020-01-22'],'price':[101,202,303,102,242,403,204,420,422]}
df = pd.DataFrame(data)
df['return'] = df.groupby('stock')['price'].pct_change()
df = df.dropna(how='any')
print(df)
Output:
stock date price return
3 abc 2020-01-21 102 0.009901
4 efg 2020-01-21 242 0.198020
5 xyz 2020-01-21 403 0.330033
6 abc 2020-01-22 204 1.000000
7 efg 2020-01-22 420 0.735537
8 xyz 2020-01-22 422 0.047146
Upvotes: 1
Reputation: 1
Assuming df_ret has no index, I would unstack and restack.
df_ret.set_index(['date', 'stock'], inplace=True)
df_ret = df_ret.unstack(axis=0,level='stock')
rets = df_ret.pct_change()
rets=rets.stack()
rets.reset_index(inplace=True)
Upvotes: 0
Reputation: 75150
Just groupby + pct_change
works:
final = df.assign(return_ = df.groupby('stock')['price'].pct_change())
stock date price return_
0 abc 2020-01-20 102.3 NaN
1 efg 2020-01-20 36.2 NaN
2 xyz 2020-01-20 341.0 NaN
3 abc 2020-01-21 102.3 0.000000
4 efg 2020-01-21 34.3 -0.052486
5 xyz 2020-01-21 321.0 -0.058651
6 abc 2020-01-22 104.1 0.017595
7 efg 2020-01-22 35.2 0.026239
8 xyz 2020-01-22 318.0 -0.009346
Upvotes: 3