Reputation: 1200
I have the following multiindex dataframe:
df
Out[44]:
Attributes Adj Close ... Volume new
Symbols ADANIPORTS.NS ASIANPAINT.NS AXISBANK.NS ... WIPRO.NS ZEEL.NS
Date ...
2015-06-12 304.541199 678.997864 536.043945 ... 5907771.0 779489.0 NaN
2015-06-15 302.589355 672.165955 534.683044 ... 2754255.0 1515607.0 NaN
2015-06-16 305.663574 678.324219 535.752380 ... 3010103.0 2013937.0 NaN
2015-06-17 303.516510 683.809143 533.905396 ... 3640391.0 2795132.0 NaN
2015-06-18 303.516510 701.417847 534.683044 ... 4764172.0 1354886.0 NaN
... ... ... ... ... ... ..
2020-06-04 339.200012 1633.750000 394.350006 ... 7397468.0 47567706.0 NaN
2020-06-05 341.149994 1638.550049 405.299988 ... 7245584.0 36525490.0 NaN
2020-06-08 343.299988 1634.250000 430.250000 ... 12213696.0 27555715.0 NaN
2020-06-09 346.049988 1642.050049 420.049988 ... 7845252.0 27072444.0 NaN
2020-06-10 346.549988 1635.949951 427.450012 ... 7391699.0 17115912.0 NaN
[1227 rows x 301 columns]
So when i run
df.columns.levels
Out[45]: FrozenList([['Adj Close', 'Close', 'High', 'Low', 'Open', 'Volume', 'new'], ['ADANIPORTS.NS', 'ASIANPAINT.NS', 'AXISBANK.NS', 'BAJAJ-AUTO.NS', 'BAJFINANCE.NS', 'BAJAJFINSV.NS', 'BHARTIARTL.NS', 'INFRATEL.NS', 'BPCL.NS', 'BRITANNIA.NS', 'CIPLA.NS', 'COALINDIA.NS', 'DRREDDY.NS', 'EICHERMOT.NS', 'GAIL.NS', 'GRASIM.NS', 'HCLTECH.NS', 'HDFC.NS', 'HDFCBANK.NS', 'HEROMOTOCO.NS', 'HINDALCO.NS', 'HINDUNILVR.NS', 'ICICIBANK.NS', 'INDUSINDBK.NS', 'INFY.NS', 'IOC.NS', 'ITC.NS', 'JSWSTEEL.NS', 'KOTAKBANK.NS', 'LT.NS', 'M&M.NS', 'MARUTI.NS', 'NESTLEIND.NS', 'NTPC.NS', 'ONGC.NS', 'POWERGRID.NS', 'RELIANCE.NS', 'SHREECEM.NS', 'SBIN.NS', 'SUNPHARMA.NS', 'TCS.NS', 'TATAMOTORS.NS', 'TATASTEEL.NS', 'TECHM.NS', 'TITAN.NS', 'ULTRACEMCO.NS', 'UPL.NS', 'VEDL.NS', 'WIPRO.NS', 'ZEEL.NS', '']])
Finally when i run:
df['Close']
Out[46]:
Symbols ADANIPORTS.NS ASIANPAINT.NS ... WIPRO.NS ZEEL.NS
Date ...
2015-06-12 312.049988 705.650024 ... 201.505997 333.700012
2015-06-15 310.049988 698.549988 ... 202.837997 339.250000
2015-06-16 313.200012 704.950012 ... 202.613007 345.000000
2015-06-17 311.000000 710.650024 ... 204.955994 351.100006
2015-06-18 311.000000 728.950012 ... 208.537994 347.000000
... ... ... ... ...
2020-06-04 339.200012 1633.750000 ... 216.250000 205.600006
2020-06-05 341.149994 1638.550049 ... 218.750000 206.199997
2020-06-08 343.299988 1634.250000 ... 226.449997 197.050003
2020-06-09 346.049988 1642.050049 ... 218.350006 192.399994
2020-06-10 346.549988 1635.949951 ... 217.000000 189.800003
So far so good. My quesiton is when i run the following command:
df['new'] =df['Close'].pct_change()
i get this:
df['new']
Out[40]:
Date
2015-06-12 NaN
2015-06-15 NaN
2015-06-16 NaN
2015-06-17 NaN
2015-06-18 NaN
..
2020-06-04 NaN
2020-06-05 NaN
2020-06-08 NaN
2020-06-09 NaN
2020-06-10 NaN
Name: new, Length: 1227, dtype: float64
Why am i getting NaN
? This is very odd because when i run:
df['Close'].pct_change()
Out[50]:
Symbols ADANIPORTS.NS ASIANPAINT.NS ... WIPRO.NS ZEEL.NS
Date ...
2015-06-12 NaN NaN ... NaN NaN
2015-06-15 -0.006409 -0.010062 ... 0.006610 0.016632
2015-06-16 0.010160 0.009162 ... -0.001109 0.016949
2015-06-17 -0.007024 0.008086 ... 0.011564 0.017681
2015-06-18 0.000000 0.025751 ... 0.017477 -0.011678
... ... ... ... ...
2020-06-04 -0.001031 -0.048236 ... 0.023184 0.050588
2020-06-05 0.005749 0.002938 ... 0.011561 0.002918
2020-06-08 0.006302 -0.002624 ... 0.035200 -0.044374
2020-06-09 0.008010 0.004773 ... -0.035769 -0.023598
2020-06-10 0.001445 -0.003715 ... -0.006183 -0.013513
[1227 rows x 50 columns]
What can i do to get pct_change
column? What i am expecting is to get a new column which will calculate the pct_change
of the closing price of each ticker. How can i do this?
Upvotes: 1
Views: 261
Reputation: 2663
You can add a new level with Close percent changes for all tickers using pd.concat()
, ie.:
df = pd.concat([df, df[['Close']].pct_change().rename(columns={'Close': 'Close_pct'})], axis=1)
Note the double brackets: df[['Close']]
- this preserves the column levels of the slice.
Upvotes: 1
Reputation: 29635
you can generate the list of columns from the columns of Close like:
df[[('New',tc) for tc in df['Close'].columns]] = df['Close'].pct_change()
with an example:
# random values and similar structure
np.random.seed(1)
df = pd.DataFrame(np.random.random(40).reshape(-1, 4),
columns=pd.MultiIndex.from_tuples([['Open', 'a'], ['Open', 'b'],
['Close', 'a'], ['Close', 'b']],
names=('lv1', 'lv2')),
index=pd.date_range(end='2020-06-12', periods=10, freq='D'))
#add the columns
df[[('New',tc) for tc in df['Close'].columns]] = df['Close'].pct_change()
print (df)
lv1 Open Close New
lv2 a b a b a b
2020-06-03 0.417022 0.720324 0.000114 0.302333 NaN NaN
2020-06-04 0.146756 0.092339 0.186260 0.345561 1627.507181 0.142982
2020-06-05 0.396767 0.538817 0.419195 0.685220 1.250585 0.982921
2020-06-06 0.204452 0.878117 0.027388 0.670468 -0.934666 -0.021529
2020-06-07 0.417305 0.558690 0.140387 0.198101 4.125932 -0.704532
2020-06-08 0.800745 0.968262 0.313424 0.692323 1.232574 2.494788
2020-06-09 0.876389 0.894607 0.085044 0.039055 -0.728661 -0.943589
2020-06-10 0.169830 0.878143 0.098347 0.421108 0.156420 9.782485
2020-06-11 0.957890 0.533165 0.691877 0.315516 6.035073 -0.250748
2020-06-12 0.686501 0.834626 0.018288 0.750144 -0.973567 1.377519
#see it is well structure
print (df.columns.levels)
FrozenList([['Close', 'Open', 'New'], ['a', 'b']])
Upvotes: 1