Mk D

Reputation: 21

Pandas apply function, receiving KeyError 'Column Name'

My dataset has a column called age and I'm trying to count the null values.

I know this can easily be done with something like len(df) - df['age'].count(). However, I'm playing around with functions and would like to use apply with a function that calculates the null count.

Here is what I have:

def age_is_null(df):
    age_col = df['age']
    null = df[age_col].isnull()
    age_null = df[null]
    return len(age_null)

count = df.apply(age_is_null)
print (count)

When I run that, I get an error: KeyError: 'age'.

Can someone tell me why I'm getting that error and what I should change in the code to make it work?

Upvotes: 1

Views: 1345

Answers (3)

jezrael

Reputation: 863741

You need DataFrame.pipe here, or you can pass the DataFrame to the function directly:

# the function can be simplified
def age_is_null(df):
    return df['age'].isnull().sum()


count = df.pipe(age_is_null)
print (count)

count = age_is_null(df)
print (count)
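
As a quick check, here is a minimal sketch on made-up data (the sample DataFrame is an assumption, not the asker's dataset). It shows that df.pipe and a plain function call give the same result, and that both match the len(df) - count() approach from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two of the five ages are missing
df = pd.DataFrame({"age": [31, np.nan, 45, np.nan, 28]})

def age_is_null(df):
    # isnull() gives a boolean Series; summing it counts the True values
    return df["age"].isnull().sum()

via_pipe = df.pipe(age_is_null)           # pipe passes the whole DataFrame to the function
via_call = age_is_null(df)                # equivalent plain function call
via_count = len(df) - df["age"].count()   # approach mentioned in the question

print(via_pipe, via_call, via_count)
```

pipe is mostly useful for chaining several DataFrame-level steps; for a single call, age_is_null(df) reads just as well.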

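The error occurs because DataFrame.apply iterates over the columns and passes each one to the function as a Series, so selecting the column age inside the function fails. You can see what the function receives with: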

def func(x):
   print (x)

df.apply(func)
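
To make that concrete, here is a small sketch on made-up data (the DataFrame and the helper name inspect are assumptions for illustration). It records what apply hands to the function, then reproduces the KeyError by looking up 'age' on a single column:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not the asker's dataset
df = pd.DataFrame({"age": [25, np.nan, 40, np.nan], "name": ["a", "b", "c", "d"]})

seen = []

def inspect(col):
    # With the default axis=0, col is one column as a Series;
    # its index holds the row labels 0..3, not the column names
    seen.append((col.name, type(col).__name__, list(col.index)))
    return len(col)

df.apply(inspect)
print(seen)

# Looking up 'age' on one column searches the row labels, so it fails
error = None
try:
    df.apply(lambda col: col["age"])
except KeyError as exc:
    error = exc
print(repr(error))
```

Since 'age' is a column name and not a row label, the lookup inside the function cannot succeed for any column.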

EDIT: To select the column in your original function, use the column name:

def age_is_null(df):
    age_col = 'age'  # <- here
    null = df[age_col].isnull()
    age_null = df[null]
    return len(age_null)

Or build the mask from the selected column:

def age_is_null(df):
    age_col = df['age']
    null = age_col.isnull()  # <- here
    age_null = df[null]
    return len(age_null)
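
For reference, a rough walk-through of the boolean mask steps on made-up data (the sample values are an assumption). The mask is a boolean Series, indexing the DataFrame with it keeps only the rows where it is True, and len then counts those rows:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: rows 1 and 3 have a missing age
df = pd.DataFrame({"age": [31, np.nan, 45, np.nan, 28], "name": list("abcde")})

mask = df["age"].isnull()   # boolean Series, True where age is missing
rows = df[mask]             # keeps only the rows where the mask is True

print(mask.tolist())
print(rows.index.tolist())
print(len(rows))
```

Both versions need to be called as age_is_null(df) or df.pipe(age_is_null), not through df.apply.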

Upvotes: 2

Shubhangi Chaturvedi

Reputation: 157

You need to pass the DataFrame df when calling the function age_is_null. That's why the age column is not recognised.

count = age_is_null(df)

Upvotes: 0

Nirali Khoda

Reputation: 1691

Instead of writing a function, you can try this, which returns the number of rows where age is null:

df[df["age"].isnull()].shape[0]

Upvotes: 0
