Piper Ramirez

Reputation: 413

df.dropna in Python erases my entire dataframe. What is wrong with my code?

I want to drop all NaN values in one of my columns, but when I use df.dropna(axis=0, inplace=True) it erases my entire dataframe. Why is this happening?

I've tried both df.dropna() and df.dropna(axis=0, inplace=True), and neither works for removing just the NaN values.

I'm binning my data so I can run a Gaussian model, but I can't do that with NaN values. I want to remove them and still have my dataframe to run the model.

[Image: before and after data]

Upvotes: 12

Views: 8280

Answers (4)

Mike C

Reputation: 1

My dataframe was losing more rows than I expected, so people with my problem might land on this page too. The fix was to call reset_index before the drop line. pandas was finding the index label that needed deleting, but that label was duplicated, so many rows that should have stayed were deleted as well. reset_index gave the dataframe a unique index and solved my issue.
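
A minimal sketch of the duplicate-index effect described above. The frame, the column name `value`, and the index labels are made up for illustration; only `drop` and `reset_index` are real pandas calls.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: the index label 0 appears twice
df = pd.DataFrame({"value": [1.0, np.nan, 3.0, 4.0]}, index=[0, 0, 1, 2])

# Labels of the rows that hold NaN (here: just the label 0)
bad_labels = df.index[df["value"].isna()]

# Dropping by label removes EVERY row carrying that label,
# so the valid row that shares label 0 is lost as well
dropped_dup = df.drop(bad_labels)

# With a unique index, only the NaN row is removed
df_unique = df.reset_index(drop=True)
bad_unique = df_unique.index[df_unique["value"].isna()]
dropped_unique = df_unique.drop(bad_unique)

print(len(dropped_dup), len(dropped_unique))
```

Passing `drop=True` to `reset_index` discards the old index instead of keeping it as a new column.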

Upvotes: 0

Carlost

Reputation: 855

For anyone in the future: try changing axis=0 to axis=1. With how='all', this drops only the columns that are entirely NaN and leaves the rows alone.

df = df.dropna(axis=1, how='all')
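
As a rough illustration of what that call does, here is a small sketch on a made-up frame (the column names `a`, `b`, `c` are assumptions, not from the question). Only the column that is NaN in every row disappears.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column "b" is NaN everywhere, "a" and "c" only partly
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, np.nan, np.nan],
    "c": [4.0, 5.0, np.nan],
})

# axis=1 works on columns; how="all" drops a column only if all its values are NaN
cols_dropped = df.dropna(axis=1, how="all")

print(list(cols_dropped.columns), len(cols_dropped))
```

Note that the remaining columns can still contain NaN values, so this alone may not be enough before fitting a model.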

Upvotes: 0

Lorenzo Bassetti

Reputation: 945

By default, dropna uses how='any', which means it deletes every row that contains at least one NaN. If every row has a NaN in some column, the whole dataframe is erased, as you found out.

To delete only the rows in which all columns are NaN, use how='all':

df.dropna(how='all', inplace=True)

or, without modifying df in place:

newDF = df.dropna(how='all')
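
To make the difference between the two settings concrete, here is a small sketch on a made-up frame (column names are assumptions). One column is entirely NaN, which is enough for the default how='any' to empty the frame.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the "empty" column is NaN in every row
df = pd.DataFrame({
    "x": [1.0, 2.0, np.nan],
    "y": [np.nan, 5.0, np.nan],
    "empty": [np.nan, np.nan, np.nan],
})

# Default how="any": every row has at least one NaN, so nothing survives
any_dropped = df.dropna()

# how="all": only the last row, which is NaN in every column, is removed
all_dropped = df.dropna(how="all")

print(len(any_dropped), len(all_dropped))
```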

Upvotes: 1

HassanSh__3571619

Reputation: 2077

Not sure about your case, but sharing the solution that worked in my case:

The ones that didn't work:

df = df.dropna()  # ==> makes the df empty
df = df.dropna(axis=0, inplace=True)  # ==> df becomes None, because inplace=True returns None
df.dropna(axis=0, inplace=True)  # ==> makes the df empty

The one that worked:

df.dropna(how='all', axis=0, inplace=True)  # ==> worked very well

Thanks to Anky above for his comment.
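
Since the question is about NaN values in one particular column, the `subset` parameter of dropna may also be worth a try. This is only a sketch: the column names `bin` and `other` are hypothetical stand-ins for the real ones.

```python
import numpy as np
import pandas as pd

# Hypothetical data: "bin" is the column of interest, "other" is NaN everywhere
df = pd.DataFrame({
    "bin": [1.0, np.nan, 3.0],
    "other": [np.nan, np.nan, np.nan],
})

# Only NaN values in "bin" decide whether a row is dropped;
# NaN values in other columns are ignored
cleaned = df.dropna(subset=["bin"])

print(len(cleaned))
```

Rows are kept here even though `other` is all NaN, which a plain `df.dropna()` would not do.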

Upvotes: 2
