Category Condition on multiple columns

Question

I have a data set as below:

ID  A1      A2
0   A123    1234
1   1234    5568
2   5568    NaN
3   Zabc    NaN
4   3456    3456
5   3456    3456
6   NaN     NaN
7   NaN     NaN

The intention is to check both columns (A1 and A2), identify the rows where both are blank (as in rows 6 and 7), and create a new column that categorises those rows as "Both A1 and A2 are blank".

I used the below code:

df['Z_Tax No Not Mapped'] = np.NaN

df['Z_Tax No Not Mapped'] = np.where((df['A1'] == np.NaN) & (df['A2'] == np.NaN), 1, 0)

However, the output shows 0 for every row in the new column 'Z_Tax No Not Mapped', even though the data has rows where both columns are blank. I'm not sure what mistake I'm making in filtering such cases.

Note: The values in columns A1 and A2 are sometimes alphanumeric and sometimes purely numeric.

The idea is to put a category such as "IDs are not updated" or "IDs are updated" in a separate column, so that a simple filter on "IDs are not updated" finds the rows that are blank in both columns.

jezrael · Accepted Answer

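Comparing with == np.NaN always returns False, because NaN never compares equal to anything, not even itself. Use DataFrame.isna with DataFrame.all(axis=1) to test whether all of the selected columns are missing in a row: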

df['Z_Tax No Not Mapped'] = np.where(df[['A1','A2']].isna().all(axis=1),
                                     'Both A1 and A2 are blank',
                                     '')
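
For completeness, here is a minimal, self-contained sketch that rebuilds the sample data from the question and applies the same isna/all idea with the two labels the question mentions. The variable names eq_mask and both_blank are only illustrative, the values are stored as strings on the assumption that the columns are of mixed type, and np.nan is used because the np.NaN alias was removed in NumPy 2.0:

```python
import numpy as np
import pandas as pd

# Sample data from the question; values kept as strings (an assumption),
# with np.nan marking the blank cells.
df = pd.DataFrame({
    "A1": ["A123", "1234", "5568", "Zabc", "3456", "3456", np.nan, np.nan],
    "A2": ["1234", "5568", np.nan, np.nan, "3456", "3456", np.nan, np.nan],
})

# NaN never compares equal to anything, including itself,
# so the equality test from the question is False for every row.
eq_mask = (df["A1"] == np.nan) & (df["A2"] == np.nan)

# isna() flags missing values; all(axis=1) is True only where
# both A1 and A2 are missing in the same row.
both_blank = df[["A1", "A2"]].isna().all(axis=1)

# Label every row so that a simple filter on the new column works.
df["Z_Tax No Not Mapped"] = np.where(
    both_blank, "IDs are not updated", "IDs are updated"
)

# Rows where both columns are blank.
print(df[df["Z_Tax No Not Mapped"] == "IDs are not updated"])
```

If only the rows themselves are needed, df[both_blank] gives the same selection without creating the extra column.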
