Reputation: 83
Using Python, I want to count the number of cells in a row that has data in it, in a pandas data frame and record the count in the leftmost cell of the row.
Upvotes: 6
Views: 22902
Reputation: 661
To count the number of cells missing data in each row, you probably want to do something like this:
df.apply(lambda x: x.isnull().sum(), axis='columns')
Replace df
with the label of your data frame.
You can create a new column and write the count to it using something like:
df['MISSING'] = df.apply(lambda x: x.isnull().sum(), axis='columns')
The column will be created at the end (rightmost) of your data frame.
You can move your columns around like this:
df = df[['Count', 'M', 'A', 'B', 'C']]
I'm wondering if your missing cells are actually empty strings as opposed to NaN
values. Can you confirm? I copied your screenshot into an Excel workbook. My full code is below:
df = pd.read_excel('count.xlsx', na_values=['', ' '])
df.head() # You should see NaN for empty cells
df['M']=df.apply(lambda x: x.isnull().sum(), axis='columns')
df.head() # Column M should report the values: first row: 0, second row: 1, third row: 2
df = df[['Count', 'M', 'A', 'B', 'C']]
df.head() # Column order should be Count, M, A, B, C
Notice the na_values
parameter in the pd.read_excel
method.
Upvotes: 13