Atiqul Islam
Atiqul Islam

Reputation: 83

Count non-empty cells in pandas dataframe rows and add counts as a column

Using Python, I want to count the number of cells in a row that has data in it, in a pandas data frame and record the count in the leftmost cell of the row.

Frame showing count column on left as requested

Upvotes: 6

Views: 22902

Answers (1)

Keith Dowd
Keith Dowd

Reputation: 661

To count the number of cells missing data in each row, you probably want to do something like this:

df.apply(lambda x: x.isnull().sum(), axis='columns')

Replace df with the label of your data frame.

You can create a new column and write the count to it using something like:

df['MISSING'] = df.apply(lambda x: x.isnull().sum(), axis='columns')

The column will be created at the end (rightmost) of your data frame.

You can move your columns around like this:

df = df[['Count', 'M', 'A', 'B', 'C']]

Update

I'm wondering if your missing cells are actually empty strings as opposed to NaN values. Can you confirm? I copied your screenshot into an Excel workbook. My full code is below:

df = pd.read_excel('count.xlsx', na_values=['', ' '])
df.head() # You should see NaN for empty cells
df['M']=df.apply(lambda x: x.isnull().sum(), axis='columns')
df.head() # Column M should report the values: first row: 0, second row: 1, third row: 2
df = df[['Count', 'M', 'A', 'B', 'C']]
df.head() # Column order should be Count, M, A, B, C

Notice the na_values parameter in the pd.read_excel method.

Upvotes: 13

Related Questions