user10389226

Reputation: 109

How to get the row numbers and column names of missing values using Python?

Using Python and pandas, I would like to produce the output below. Whenever a Null or NaN value is present in the data, the script should print both the row number and the column name.

import pandas as pd

# List of Tuples
employees = [('Stuti', 'Null', 'Varanasi', 20000),
        ('Saumya', 'NAN', 'NAN', 35000),
        ('Saumya', 32, 'Delhi', 30000),
        ('Aaditya', 40, 'Dehradun', 24000),
        ('NAN', 45, 'Delhi', 70000)
        ]

# Create a DataFrame object from list
df = pd.DataFrame(employees,
            columns =['Name', 'Age',
            'City', 'Salary'])
print(df)

Expected Output:

Row 0: column Age missing
Row 1: column Age, column City missing
Row 4: column Name missing

Upvotes: 0

Views: 292

Answers (1)

Quang Hoang

Reputation: 150785

Use isin to build a boolean mask of the missing values, then matrix-multiply (@) the mask with the column labels to concatenate, for each row, the names of the columns that are missing:

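# Boolean mask: True where a cell holds one of the placeholder strings
s = df.isin(['Null', 'NAN'])

# Keep rows with at least one hit; True @ label keeps the label, False drops it
missing = s.loc[s.any(axis=1)] @ ('column ' + df.columns + ', ')
for r, val in missing.str[:-2].items():  # strip the trailing ', '
    print(f'Row {r}: {val} is missing')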

Output:

Row 0: column Age is missing
Row 1: column Age, column City is missing
Row 4: column Name is missing
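
If the matrix-multiply trick feels opaque, the same report can be built with an explicit loop over the mask. This is only a sketch of an alternative, not part of the original answer: the helper name missing_report is made up for illustration, and it assumes that real NaN/None values should be reported alongside the 'Null' and 'NAN' placeholder strings (hence the extra df.isna()).

```python
import pandas as pd

employees = [('Stuti', 'Null', 'Varanasi', 20000),
             ('Saumya', 'NAN', 'NAN', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('NAN', 45, 'Delhi', 70000)]
df = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Salary'])

# True where a cell holds a placeholder string or a real NaN/None
# (including isna() is an assumption about what counts as "missing")
mask = df.isin(['Null', 'NAN']) | df.isna()


def missing_report(mask):
    """Hypothetical helper: one message per row that has any flagged cell."""
    lines = []
    for row, flags in mask.iterrows():
        cols = [f'column {c}' for c, is_missing in flags.items() if is_missing]
        if cols:
            lines.append(f"Row {row}: {', '.join(cols)} is missing")
    return lines


report = missing_report(mask)
print('\n'.join(report))
```

The loop is slower than the vectorised version on large frames, but it makes it easy to change the message format or collect the results instead of printing them.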

Upvotes: 3
