Reputation: 77
I'm trying to create a new column on an existing dataframe, but I keep getting a "KeyError". My dataframe has a column with the date of birth, and I want to use that column to compute each client's age. The code I use is:
for i in range(len(df1)):
    df1['Idade'][i] = calculate_age(df1['Data de Nascimento'][i])
As far as I can tell, there's nothing wrong with the function "calculate_age" itself, but I always get this:
Traceback (most recent call last):
File "<ipython-input-8-79d009216c4d>", line 2, in <module>
df1['Idade'][i] = calculate_age(df1['Data de Nascimento'][i])
File "/home/mbarreto/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py", line 2927, in __getitem__
indexer = self.columns.get_loc(key)
File "/home/mbarreto/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2659, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Idade'
What am I doing wrong?
Upvotes: 0
Views: 458
Reputation: 897
The statement df1['Idade'][i] = ... first has to look up df1['Idade'], a column that doesn't exist yet, hence the KeyError.
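To make the cause concrete, here is a minimal sketch on a made-up two-row dataframe (the sample dates and ages are illustrative only, not from the question). It shows that reading a missing column raises KeyError, while assigning to the column name creates it:

```python
import pandas as pd

# Illustrative data; only the column names come from the question.
df1 = pd.DataFrame({"Data de Nascimento": ["1990-06-15", "2000-12-31"]})

try:
    # df1["Idade"] is evaluated first (a lookup), so this fails
    # before any value is assigned.
    df1["Idade"][0] = 30
    raised = False
except KeyError:
    raised = True

# Assigning to the column name directly creates the column.
df1["Idade"] = [30, 23]
```

After this runs, raised is True and the 'Idade' column exists with both values.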
Here is an alternative using pandas' datetime support, assuming the date of birth is a timestamp column in your dataframe (df1):
now = pd.Timestamp.now()  # pd.datetime is deprecated and was removed in pandas 2.0
df1['age'] = now.year - pd.DatetimeIndex(df1['dateofbirth']).year  # difference in calendar years
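Note that subtracting years alone overstates the age by one for anyone whose birthday has not yet occurred this year. A possible vectorized refinement is sketched below; the helper name age_in_years, the sample dates, and the fixed reference date are assumptions made for illustration:

```python
import pandas as pd

def age_in_years(born: pd.Series, today: pd.Timestamp) -> pd.Series:
    """Whole years between each birth date and `today`."""
    born = pd.to_datetime(born)
    # True where this year's birthday is on or before `today`.
    had_birthday = (born.dt.month < today.month) | (
        (born.dt.month == today.month) & (born.dt.day <= today.day)
    )
    # Subtract one year where the birthday is still to come.
    return today.year - born.dt.year - (~had_birthday).astype(int)

# Illustrative data and a fixed date so the result is reproducible.
df1 = pd.DataFrame({"dateofbirth": ["1990-06-15", "2000-12-31", "1985-01-01"]})
today = pd.Timestamp("2024-06-14")
df1["age"] = age_in_years(df1["dateofbirth"], today)
```

In real use you would pass pd.Timestamp.now() instead of the fixed date.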
Upvotes: 0
Reputation: 5481
You are trying to index into a column that has not been created yet.
Instead, build the whole column in a single assignment, like this:
df1['Idade'] = [calculate_age(df1['Data de Nascimento'][i]) for i in range(len(df1))]
Or, more cleanly:
df1['Idade'] = df1['Data de Nascimento'].apply(calculate_age)
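One reason to prefer .apply: it does not rely on the row labels being 0..n-1, whereas df1['Data de Nascimento'][i] looks rows up by label. The sketch below uses a stand-in calculate_age (the asker's real function is not shown, so this version, its today parameter, and the sample data are hypothetical) on a dataframe with a non-default index:

```python
from datetime import date

import pandas as pd

def calculate_age(born, today=date(2024, 6, 14)):
    """Hypothetical stand-in for the asker's function."""
    born = pd.Timestamp(born)
    # The tuple comparison is True (counts as 1) if the birthday is still ahead.
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))

# Row labels 10 and 20 rather than 0 and 1.
df1 = pd.DataFrame(
    {"Data de Nascimento": ["1990-06-15", "1985-01-01"]},
    index=[10, 20],
)
df1["Idade"] = df1["Data de Nascimento"].apply(calculate_age)
```

The new column is aligned with the existing index, so the ages land on the right rows.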
Upvotes: 1