Vane Leung

Reputation: 65

KeyError: 0 when changing time format of data

I have a column of dates stored as strings in "%d%m%Y" format, such as "15022016". I need to convert them to "%Y-%m-%d" format, such as "2016-02-15".

The data frame has 911,462 rows, and my code is below:

for i in range(0,911462):
    df['Date'][i]=datetime.datetime.strftime(datetime.datetime.strptime(df['Date'][i],"%d%m%Y"),"%Y-%m-%d")

Then I got the following error:

Traceback (most recent call last):
  File "C:\Users\liangfan\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\indexes\base.py", line 2393, in get_loc
    return self._engine.get_loc(key)
  File "pandas\_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5239)
  File "pandas\_libs\index.pyx", line 154, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5085)
  File "pandas\_libs\hashtable_class_helper.pxi", line 1207, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20405)
  File "pandas\_libs\hashtable_class_helper.pxi", line 1215, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20359)
KeyError: 0
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "<input>", line 2, in <module>
  File "C:\Users\liangfan\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\frame.py", line 2062, in __getitem__
    return self._getitem_column(key)
  File "C:\Users\liangfan\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\frame.py", line 2074, in _getitem_column
    result = result[key]
  File "C:\Users\liangfan\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\frame.py", line 2062, in __getitem__
    return self._getitem_column(key)
  File "C:\Users\liangfan\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\frame.py", line 2069, in _getitem_column
    return self._get_item_cache(key)
  File "C:\Users\liangfan\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\generic.py", line 1534, in _get_item_cache
    values = self._data.get(item)
  File "C:\Users\liangfan\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\internals.py", line 3590, in get
    loc = self.items.get_loc(item)
  File "C:\Users\liangfan\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\indexes\base.py", line 2395, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas\_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5239)
  File "pandas\_libs\index.pyx", line 154, in pandas._libs.index.IndexEngine.get_loc (pandas\_libs\index.c:5085)
  File "pandas\_libs\hashtable_class_helper.pxi", line 1207, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20405)
  File "pandas\_libs\hashtable_class_helper.pxi", line 1215, in pandas._libs.hashtable.PyObjectHashTable.get_item (pandas\_libs\hashtable.c:20359)
KeyError: 0

I checked the raw data in Excel and it all looks fine, so the raw data should not be the problem. It's quite weird that the KeyError is 0. I have no idea what's wrong or how to deal with it.

Thanks for reading; I look forward to your help! :)

Upvotes: 2

Views: 1047

Answers (1)

jezrael

Reputation: 862781

You need pandas.to_datetime with the format parameter, which converts the whole column at once instead of looping over the rows:

df = pd.DataFrame({'Date':[15022016,15022016]})
print (df)
       Date
0  15022016
1  15022016

df['Date'] = pd.to_datetime(df['Date'], format='%d%m%Y')
print (df)
        Date
0 2016-02-15
1 2016-02-15

print (df['Date'].dtype)
datetime64[ns]
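
If the goal is the text form "2016-02-15" rather than a datetime64 column, one possible follow-up is to format the parsed column back to strings with Series.dt.strftime. The sketch below is only an illustration: the sample values are made up, and it assumes the column holds strings in the "%d%m%Y" layout described in the question.

```python
import pandas as pd

# Hypothetical sample data; the strings follow the "%d%m%Y" layout from the question.
df = pd.DataFrame({'Date': ['15022016', '01032016']})

# Parse the whole column in one call (no row-by-row loop needed).
parsed = pd.to_datetime(df['Date'], format='%d%m%Y')

# Format the datetimes back to text in "%Y-%m-%d" form.
df['Date'] = parsed.dt.strftime('%Y-%m-%d')

print(df['Date'].tolist())
# ['2016-02-15', '2016-03-01']
```

As for the original loop, df['Date'][i] looks i up by label rather than by position, so KeyError: 0 generally means that no label 0 exists in the object being indexed. The vectorized call avoids that lookup entirely.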

Upvotes: 2
