jlt199
jlt199

Reputation: 2449

Python Pandas Dataframe, remove all rows where 'None' is the value in any column

I have a large dataframe. When it was created 'None' was used as the value where a number could not be calculated (instead of 'nan')

How can I delete all rows that have 'None' in any of it's columns? I though I could use df.dropna and set the value of na, but I can't seem to be able to.

Thanks

I think this is a good representation of the dataframe:

temp = pd.DataFrame(data=[['str1','str2',2,3,5,6,76,8],['str3','str4',2,3,'None',6,76,8]])

Upvotes: 50

Views: 155224

Answers (5)

Raj
Raj

Reputation: 188

if still None is not removed , we can do

df = df.replace(to_replace='None', value=np.nan).dropna()

the above solution worked partially still the None was converted to NaN but not removed (thanks to the above answer as it helped to move further) so then i added one more line of code that is take the particular column

df['column'] = df['column'].apply(lambda x : str(x))

this changed the NaN to nan now remove the nan

df = df[df['column'] != 'nan']

Upvotes: 1

Oenomaus
Oenomaus

Reputation: 21

im a bit late to the party, but this is prob the simplest method:

df.dropna(axis=0, how='any')

Parameters: axis='index/column' how='any/all'

axis '0' is for dropping rows (most common), and '1' will drop columns instead. and the parameter how will drop if there are 'any' None types in the row/ column, or if they are all None types (how='all')

Upvotes: 2

jlt199
jlt199

Reputation: 2449

Thanks for all your help. In the end I was able to get

df = df.replace(to_replace='None', value=np.nan).dropna()

to work. I'm not sure why your suggestions didn't work for me.

Upvotes: 39

piRSquared
piRSquared

Reputation: 294488

Setup
Borrowed @MaxU's df

df = pd.DataFrame([
    [1, 2, 3],
    [4, None, 6],
    [None, 7, 8],
    [9, 10, 11]
], dtype=object)

Solution
You can just use pd.DataFrame.dropna as is

df.dropna()

   0   1   2
0  1   2   3
3  9  10  11

Supposing you have None strings like in this df

df = pd.DataFrame([
    [1, 2, 3],
    [4, 'None', 6],
    ['None', 7, 8],
    [9, 10, 11]
], dtype=object)

Then combine dropna with mask

df.mask(df.eq('None')).dropna()

   0   1   2
0  1   2   3
3  9  10  11

You can ensure that the entire dataframe is object when you compare with.

df.mask(df.astype(object).eq('None')).dropna()

   0   1   2
0  1   2   3
3  9  10  11

Upvotes: 53

MaxU - stand with Ukraine
MaxU - stand with Ukraine

Reputation: 210882

UPDATE:

In [70]: temp[temp.astype(str).ne('None').all(1)]
Out[70]:
      0     1  2  3  4  5   6  7
0  str1  str2  2  3  5  6  76  8

Old answer:

In [35]: x
Out[35]:
      a     b   c
0     1     2   3
1     4  None   6
2  None     7   8
3     9    10  11

In [36]: x = x[~x.astype(str).eq('None').any(1)]

In [37]: x
Out[37]:
   a   b   c
0  1   2   3
3  9  10  11

or bit nicer variant from @roganjosh:

In [47]: x = x[x.astype(str).ne('None').all(1)]

In [48]: x
Out[48]:
   a   b   c
0  1   2   3
3  9  10  11

Upvotes: 10

Related Questions