Reputation: 109
Using Python and pandas, I would like to achieve the output below. Whenever Null or NaN values are present in the file, it needs to print both the row number and the column name.
import pandas as pd

# List of tuples
employees = [('Stuti', 'Null', 'Varanasi', 20000),
             ('Saumya', 'NAN', 'NAN', 35000),
             ('Saumya', 32, 'Delhi', 30000),
             ('Aaditya', 40, 'Dehradun', 24000),
             ('NAN', 45, 'Delhi', 70000)]

# Create a DataFrame object from the list
df = pd.DataFrame(employees,
                  columns=['Name', 'Age', 'City', 'Salary'])
print(df)
Expected Output:
Row 0: column Age missing
Row 1: Column Age, column City missing
Row 4: Column Name missing
Upvotes: 0
Views: 292
Reputation: 150785
Use isin to build a boolean mask of the placeholder values, then matrix-multiply (@) that mask with the column names to concatenate the names of the flagged columns in each row:
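s = df.isin(['Null', 'NAN'])
# keep only rows with at least one flagged cell (axis must be passed by keyword in pandas 2.x)
missing = s.loc[s.any(axis=1)] @ ('column ' + df.columns + ', ')
# drop the trailing ', ' before printing
for r, val in missing.str[:-2].items():
    print(f'Row {r}: {val} is missing')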
Output:
Row 0: column Age is missing
Row 1: column Age, column City is missing
Row 4: column Name is missing
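If the file contains real NaN/None values rather than the literal strings 'Null' and 'NAN', isin on those strings will not flag them. Below is a minimal sketch that checks for both. The helper name report_missing, its placeholders parameter, and the sample rows are made up for illustration and are not from the question:

```python
import numpy as np
import pandas as pd


def report_missing(df, placeholders=("Null", "NAN")):
    """Return one message per row that has a missing or placeholder value.

    Hypothetical helper: the name and the default placeholders are assumptions.
    """
    # True where a cell is a real NaN/None or one of the placeholder strings
    mask = df.isna() | df.isin(list(placeholders))
    messages = []
    # Iterate only over rows that have at least one flagged cell
    for row_label, flags in mask[mask.any(axis=1)].iterrows():
        cols = ", ".join(f"column {c}" for c in flags[flags].index)
        messages.append(f"Row {row_label}: {cols} is missing")
    return messages


# Made-up sample mixing real NaN/None with a placeholder string
df = pd.DataFrame(
    [("Stuti", np.nan, "Varanasi", 20000),
     ("Saumya", "NAN", None, 35000),
     ("Aaditya", 40, "Dehradun", 24000)],
    columns=["Name", "Age", "City", "Salary"],
)
for line in report_missing(df):
    print(line)
```

This uses a plain loop instead of the @ trick, which is slower on large frames but easier to adapt if the message format changes.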
Upvotes: 3