Matheus Barreto Alves

Reputation: 77

KeyError while trying to create a Column in a DataFrame

I'm trying to create a new column on an existing DataFrame, but I keep getting a KeyError. My DataFrame has a column with the date of birth, and I want to use it to compute each client's age. The code I use is:

for i in range(len(df1)):
     df1['Idade'][i] = calculate_age(df1['Data de Nascimento'][i])

As far as I can tell, there's nothing wrong with the calculate_age function itself, but I always get this:

Traceback (most recent call last):

  File "<ipython-input-8-79d009216c4d>", line 2, in <module>
    df1['Idade'][i] = calculate_age(df1['Data de Nascimento'][i])

  File "/home/mbarreto/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py", line 2927, in __getitem__
    indexer = self.columns.get_loc(key)

  File "/home/mbarreto/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2659, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))

  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc

  File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc

  File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item

  File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item

KeyError: 'Idade'

What am I doing wrong?

Upvotes: 0

Views: 458

Answers (2)

stephen_mugisha

Reputation: 897

The expression df1['Idade'][i] = ... first looks up the column 'Idade' and only then assigns into the result. Since that column doesn't exist yet, the lookup raises the KeyError.
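A minimal sketch that reproduces the error, using a made-up one-row frame (the sample date and the values assigned are only for illustration):

```python
import pandas as pd

# Hypothetical one-row frame; the column name follows the question.
df1 = pd.DataFrame({"Data de Nascimento": ["1990-06-15"]})

try:
    # df1["Idade"] is evaluated first as a lookup, and the column is missing
    df1["Idade"][0] = 30
    raised = False
except KeyError:
    raised = True

# Assigning to the column label itself creates the column
df1["Idade"] = 0
```

The first assignment fails before any value is written, while assigning to df1["Idade"] directly works because it is a column creation and not a lookup.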

Here is an alternative using pandas' datetime support, assuming the date of birth is a datetime column in your DataFrame (df1); replace 'dateofbirth' with your actual column name:

now = pd.Timestamp.now()  # pd.datetime is deprecated and no longer available in recent pandas versions
df1['age'] = now.year - pd.DatetimeIndex(df1['dateofbirth']).year
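Note that subtracting years alone overcounts by one for anyone whose birthday has not yet occurred in the current year. A minimal sketch of one possible correction, assuming the column can be parsed by pd.to_datetime; the sample dates and the fixed reference date `today` are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data; the column name follows the question.
df1 = pd.DataFrame({"Data de Nascimento": ["1990-06-15", "2000-12-31", "1985-01-01"]})

dob = pd.to_datetime(df1["Data de Nascimento"])
# Fixed reference date so the result is reproducible; use pd.Timestamp.now() in practice
today = pd.Timestamp("2024-06-14")

# True where the birthday has not happened yet in the reference year
not_yet = (dob.dt.month > today.month) | (
    (dob.dt.month == today.month) & (dob.dt.day > today.day)
)

# Year difference, minus one for rows whose birthday is still ahead
df1["Idade"] = today.year - dob.dt.year - not_yet.astype(int)
```

With these sample dates the resulting ages are 33, 23 and 39.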

Upvotes: 0

Dev Khadka

Reputation: 5481

You are trying to index into a column that has not been created yet.

You need to build the whole column in a single assignment, like this:

df1['Idade'] = [calculate_age(df1['Data de Nascimento'][i]) for i in range(len(df1))]

or, even cleaner:

df1['Idade'] = df1['Data de Nascimento'].apply(calculate_age)
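For a self-contained version of the apply approach, here is a sketch. The question does not show calculate_age, so the function below is a hypothetical stand-in, and the sample dates and the default reference date are made up for illustration:

```python
from datetime import date

import pandas as pd


def calculate_age(born, today=date(2024, 6, 14)):
    """Hypothetical stand-in for the asker's function: full years from born to today."""
    born = pd.Timestamp(born)
    # Subtract one if the birthday has not occurred yet in today's year
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))


df1 = pd.DataFrame({"Data de Nascimento": ["1990-06-15", "2000-12-31", "1985-01-01"]})

# Assigning a whole Series creates the column; no per-row indexing is needed
df1["Idade"] = df1["Data de Nascimento"].apply(calculate_age)
```

Because the right-hand side is a complete Series, pandas creates 'Idade' in one step and the KeyError cannot occur.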

Upvotes: 1
