Reputation: 31
I'm trying to build a list of S&P 500 returns for each year. I have a pandas DataFrame with the dates and closing prices of the S&P index going back to 1980. The Date column has the format year-month-day, and I'm trying to create a new column containing only the year. Whenever I call pd.DatetimeIndex with df['Date'], which I thought was one of the DataFrame's columns, it throws a KeyError. Any tips or suggestions?
code:
import pandas as pd
import yfinance as yf
import pandas_datareader.data as web
import datetime
df = web.DataReader('^GSPC', 'yahoo', start='1980-1-1', end='2021-1-1')
clear = ['High','Low','Volume', 'Open', 'Close']
df.drop(columns=clear, inplace=True)
print(df)
df['year'] = pd.DatetimeIndex(df['Date']).year
print(df)
DataFrame as printed in the console:
>>> %Run dff.py
              Adj Close
Date
1980-01-02   105.760002
1980-01-03   105.220001
1980-01-04   106.519997
1980-01-07   106.809998
1980-01-08   108.949997
...                 ...
2020-12-24  3703.060059
2020-12-28  3735.360107
2020-12-29  3727.040039
2020-12-30  3732.040039
2020-12-31  3756.070068

[10340 rows x 1 columns]
error message:
Traceback (most recent call last):
File "C:\Users\charlie\AppData\Roaming\Python\Python37\site-packages\pandas\core\indexes\base.py", line 3080, in get_loc
return self._engine.get_loc(casted_key)
File "pandas\_libs\index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 4554, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 4562, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Date'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\charlie\Desktop\ISP\dff.py", line 12, in <module>
df['year'] = pd.DatetimeIndex(df['Date']).year
File "C:\Users\charlie\AppData\Roaming\Python\Python37\site-packages\pandas\core\frame.py", line 3024, in __getitem__
indexer = self.columns.get_loc(key)
File "C:\Users\charlie\AppData\Roaming\Python\Python37\site-packages\pandas\core\indexes\base.py", line 3082, in get_loc
raise KeyError(key) from err
KeyError: 'Date'
If the error can't be fixed, is there any other way I can do this?
Upvotes: 1
Views: 431
Reputation:
The Date column is the index, so you have to refer to it as df.index instead of df['Date'] or df.Date.
Try this:
df['year'] = pd.DatetimeIndex(df.index).year
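If it helps, here is a minimal sketch of the same idea on a tiny made-up frame, so it runs without downloading anything. The prices and dates are invented for illustration and only the layout (a DatetimeIndex named Date and an Adj Close column) mirrors the question. It also shows one possible way to get from the year column to yearly returns, assuming "return" means the change between consecutive year-end closes; adjust it if you define returns differently.

```python
import pandas as pd

# Hypothetical stand-in for the downloaded data: made-up prices,
# indexed by a DatetimeIndex named "Date" as in the question.
dates = pd.DatetimeIndex(
    ["2019-12-30", "2019-12-31", "2020-01-02", "2020-12-31"], name="Date"
)
df = pd.DataFrame({"Adj Close": [100.0, 110.0, 112.0, 121.0]}, index=dates)

# "Date" is the index, not a column, so df["Date"] would raise KeyError.
# Option 1: read the year straight off the DatetimeIndex.
df["year"] = df.index.year

# Option 2: move the index back into a regular column, then use .dt.
flat = df.reset_index()
flat["year"] = flat["Date"].dt.year

# One possible definition of yearly return (an assumption, not the only one):
# take the last close of each year and compute the year-over-year change.
year_end = df.groupby("year")["Adj Close"].last()
yearly_returns = year_end.pct_change().dropna()
print(yearly_returns)
```

Since the index is already a DatetimeIndex, df.index.year does the same thing as wrapping it in pd.DatetimeIndex again. The first year drops out of yearly_returns because there is no earlier year-end close to compare it with.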
Upvotes: 3