Pandas: Drop all columns with missing values except one column

Question

Suppose we have a dataframe with the columns 'Age', 'Name', and 'Sex', where 'Age' and 'Sex' contain missing values. I want to drop all columns with missing values except for the column 'Age', so that I end up with a df containing only the two columns 'Name' and 'Age'. How can I do this?

mrhd · Accepted Answer

This should do what you need:

import pandas as pd
import numpy as np

df = pd.DataFrame({
  'Age'  : [5,np.nan,12,43], 
  'Name' : ['Alice','Bob','Charly','Dan'],
  'Sex'  : ['F','M','M',np.nan]})

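# keep a column if it has no missing values OR if it is named 'Age'
df_filt = df.loc[:, (~df.isnull().any()) | (df.columns.isin(['Age']))]

Explanation:

df.isnull().any() checks, for each column, whether any value is None or NaN. The ~ inverts that boolean result, so that only those columns are selected that do not meet that criterion, i.e. columns without missing values.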

df.columns.isin(['Age']) checks for all columns if their name is 'Age', so that this column is selected in any case.

Both conditions are connected by an OR (|) so that if either condition applies the column is selected.
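If several columns need to be protected, the same mask can be wrapped in a small helper. This is only a sketch of the idea described above: the function name drop_na_columns_except and its keep parameter are made up for illustration and are not part of pandas.

```python
import numpy as np
import pandas as pd


def drop_na_columns_except(df, keep):
    """Hypothetical helper: drop columns containing missing values,
    unless the column name is listed in `keep`."""
    has_missing = df.isnull().any()    # boolean Series, one entry per column
    protected = df.columns.isin(keep)  # boolean array, same order as df.columns
    # keep a column if it is complete OR explicitly protected
    return df.loc[:, ~has_missing | protected]


df = pd.DataFrame({
    'Age': [5, np.nan, 12, 43],
    'Name': ['Alice', 'Bob', 'Charly', 'Dan'],
    'Sex': ['F', 'M', 'M', np.nan],
})

df_filt = drop_na_columns_except(df, ['Age'])
print(list(df_filt.columns))  # ['Age', 'Name']
```

The column order of the original dataframe is preserved, and 'Age' still contains its NaN; only 'Sex' is dropped.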
