Reputation:
I have a very large DataFrame with many columns (almost 300). I would like to remove all rows in which the values of all columns, except a column called 'Country', are NaN.
dropna can remove rows in which all or some values are NaN. But what is an efficient way to do it when there's a column you want to exclude from the process?
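For reference, a minimal sketch of why the plain call falls short (the column names here are illustrative): with how="all", a row survives as long as any column is populated, so a never-NaN 'Country' column keeps every row.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [np.nan, 1.0], "B": [np.nan, np.nan], "Country": ["US", "FR"]})

# how="all" only drops a row when every column is NaN, so the
# always-populated 'Country' column keeps both rows here
result = df.dropna(how="all")
print(result)
```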
Upvotes: 2
Views: 2838
Reputation: 162
This should work:
def remove_if_eq(v, dataframe, keys):
    # drop each row whose value equals v in every one of the given columns
    for i in dataframe.index:
        delete = True
        for j in keys:
            cell = dataframe.at[i, j]
            # NaN never compares equal to itself, so test it explicitly
            if not (cell == v or (pd.isna(v) and pd.isna(cell))):
                delete = False
                break
        if delete:
            dataframe = dataframe.drop(i)
    return dataframe
v is the value to match, dataframe is the DataFrame, and keys are the names of the columns you want searched.
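A self-contained usage sketch (the NaN-aware comparison via pd.isna is an assumption, since a plain != test can never match NaN):

```python
import numpy as np
import pandas as pd

def remove_if_eq(v, dataframe, keys):
    # drop each row whose value equals v in every one of the given columns
    def matches(cell):
        # NaN never compares equal to itself, so test it explicitly
        return pd.isna(cell) if pd.isna(v) else cell == v
    for i in dataframe.index:
        if all(matches(dataframe.at[i, j]) for j in keys):
            dataframe = dataframe.drop(i)
    return dataframe

df = pd.DataFrame({"A": [np.nan, 2.0], "B": [np.nan, 4.0], "Country": ["US", "FR"]})
# drop rows where both A and B are NaN, leaving 'Country' out of the check
out = remove_if_eq(np.nan, df, ["A", "B"])
print(out)
```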
Upvotes: 0
Reputation: 22493
You can use filter to exclude the column Country:
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [np.nan, 2, 3], "Countryside": [np.nan, 4, 6], "Country": list("ABC")})
A Countryside Country
0 NaN NaN A
1 2.0 4.0 B
2 3.0 6.0 C
print(df[~df.filter(regex=r"^(?!Country\b)").isnull().all(1)])
A Countryside Country
1 2.0 4.0 B
2 3.0 6.0 C
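The \b in the lookahead matters: it rejects only the exact word Country while still keeping Countryside. A quick check of which columns the filter selects:

```python
import pandas as pd

df = pd.DataFrame({"A": [1], "Countryside": [2], "Country": ["X"]})

# ^(?!Country\b) fails only for the exact label "Country";
# "Countryside" passes because 's' is a word character, so \b cannot match
kept = df.filter(regex=r"^(?!Country\b)").columns.tolist()
print(kept)  # ['A', 'Countryside']
```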
Upvotes: 0
Reputation: 75080
Try:
mask = df.drop("Country",axis=1).isna().all(1) & df['Country'].notna()
out = df[~mask]
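A similar result can be reached with dropna itself, using subset to leave 'Country' out of the check. Note this is a sketch and not exactly equivalent: it drops qualifying rows whether or not 'Country' is NaN, whereas the mask above additionally requires 'Country' to be present.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [np.nan, 2.0], "B": [np.nan, 4.0], "Country": ["US", "FR"]})

# keep a row only if at least one column other than 'Country' is non-NaN
out = df.dropna(how="all", subset=df.columns.difference(["Country"]))
print(out)
```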
Upvotes: 1