student

Reputation: 347

How to delete rows with NaN in a pandas dataframe?

I have this pandas dataframe, which was loaded from an Excel spreadsheet:

    Unnamed: 0  Date    Num     Company     Link    ID
0   NaN     1990-11-15  131231  apple...    http://www.example.com/201611141492/xellia...   290834
1   NaN     1990-10-22  1231    microsoft http://www.example.com/news/arnsno...     NaN
2   NaN     2011-10-20  123     apple   http://www.example.com/ator...  209384
3   NaN     2013-10-27  123     apple...    http://example.com/sections/th-shots/2016/...   098
4   NaN     1990-10-26  123     google  http://www.example.net/business/Drugmak...  098098
5   NaN     1990-10-18  1231    google...   http://example.com/news/va-rece...  NaN
6   NaN     2011-04-26  546     amazon...   http://www.example.com/news/home/20160425...    9809

I would like to remove all the rows that have NaN in the ID column and reset the index (the unnamed leftmost column):

    Unnamed: 0  Date    Num     Company     Link    ID
0   NaN     1990-11-15  131231  apple...    http://www.example.com/201611141492/xellia...   290834
1   NaN     2011-10-20  123     apple   http://www.example.com/ator...  209384
2   NaN     2013-10-27  123     apple...    http://example.com/sections/th-shots/2016/...   098
3   NaN     1990-10-26  123     google  http://www.example.net/business/Drugmak...  098098
4   NaN     2011-04-26  546     amazon...   http://www.example.com/news/home/20160425...    9809

I know that this can be done as follows:

df = df['ID'].dropna()

Or

df[df.ID != np.nan]

Or

df = df[np.isfinite(df['ID'])]

This one raises:

TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Or

df[df.ID()]

Or:

df[df.ID != '']

And then:

df.reset_index(drop=True, inplace=True)

However, none of these removed the NaN values in ID. I still get the original dataframe.

UPDATE

In:

df['ID'].values

Out:

array([ '....A lot of text....',
       nan,
       "A lot of text...",
       "More text",
       'text from the site',
       nan,
       "text from the site"], dtype=object)

Upvotes: 4

Views: 9641

Answers (2)

ℕʘʘḆḽḘ

Reputation: 19405

Try this, which works if the missing values were read in as the string 'nan':

df = df[df.ID != 'nan']
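
Note that this comparison only filters rows when the missing values are literally the string 'nan'; a real float NaN compares unequal to everything, so it passes the filter. A minimal sketch with made-up data (the ID values here are hypothetical placeholders, not the asker's data) that handles both cases:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one real NaN and one literal string 'nan'
df = pd.DataFrame({"ID": ["290834", np.nan, "nan", "098"]})

# Comparing against the string only removes the literal 'nan' string;
# the real NaN row is kept because NaN != 'nan' evaluates to True
by_string = df[df.ID != "nan"]

# notna() removes real missing values (NaN/None) instead
by_notna = df[df.ID.notna()]

# Combining both handles either representation, then renumber the index
cleaned = df[df.ID.notna() & (df.ID != "nan")].reset_index(drop=True)
```

Checking `df['ID'].values` as in the question's UPDATE shows which representation you actually have: an unquoted `nan` in an object array is a real NaN, not a string.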

Upvotes: 4

Mainul Islam

Reputation: 1276

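Try df.dropna(axis=0, subset=["ID"]), which drops the rows where ID is NaN and returns a new dataframe, so assign the result back to df.

Note that df.dropna(axis=1) drops columns that contain NaN rather than rows, so it would remove the ID column itself. See if it helps.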
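
A minimal sketch of dropping the rows and renumbering the index in one go, using a small made-up frame shaped loosely like the one in the question (the values are placeholders, and this assumes the missing IDs are real NaN values):

```python
import numpy as np
import pandas as pd

# Placeholder data standing in for the spreadsheet
df = pd.DataFrame({
    "Company": ["apple", "microsoft", "apple", "google"],
    "ID": ["290834", np.nan, "209384", np.nan],
})

# Drop rows where ID is missing, then renumber the index from 0.
# dropna returns a new dataframe, so the result must be assigned back.
df = df.dropna(subset=["ID"]).reset_index(drop=True)
```

Selecting the column first, as in `df['ID'].dropna()`, would return only the ID Series and lose the other columns, which is why `subset` is used here.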

Upvotes: 5
