How to drop the rows if and only if values of particular columns are missing?

Question

I need to handle the missing values by dropping a particular row or column. I know how to drop a column or row with missing values, but what I need is a little bit different. I need to drop the rows if and only if two particular columns have missing values.

Here is the example I have tried:

# Basic libraries
import os
import pandas as pd
import numpy as np

data_dict = {'First':[100, 90, np.nan, np.nan], 
        'Second': [30, 45, 56, np.nan], 
        'Third':[np.nan, 40, 80, np.nan],
        'Forth': [30,40,50,np.nan]}

df1 = pd.DataFrame(data_dict)
df1.dropna(subset=['First','Second'],inplace=True)
df1

As shown, I am trying to drop the rows where First==Nan and Second==Nan. In other words, I need both conditions to be correct. So, I need just the last row to be dropped, where both values are Nan. So, row 2 will not be deleted since one of the conditions is satisfied. Unfortunately, the command I've used drops the rows if one of the conditions is satisfied.

jezrael · Accepted Answer

Use how='all' parameter in DataFrame.dropna, for you not working because default value is how='any':

how: {'any', 'all'}, default 'any'
Determine if row or column is removed from DataFrame, when we have at least one NA or all NA.
'any' : If any NA values are present, drop that row or column.
'all' : If all values are NA, drop that row or column.

df1.dropna(subset=['First','Second'],inplace=True, how='all')
print (df1)
   First  Second  Third  Forth
0  100.0    30.0    NaN   30.0
1   90.0    45.0   40.0   40.0
2    NaN    56.0   80.0   50.0

EDIT: For pass variable list use:

list_columns = ['First','Second','Third','Forth']
df1.dropna(subset=list_columns,inplace=True, how='all')

How to drop the rows if and only if values of particular columns are missing?

Answers (1)

Related Questions