CreamStat

Reputation: 2185

Filtering null values from keys of dictionary - Python

I have a pandas data frame and built a dictionary from some of its columns. The dictionary is generated almost correctly; the only problem is that my attempt to filter out NaN values doesn't work, so NaN still appears as a key in the dictionary. My code is the following:

for key,row in mr.iterrows():
    # With this line I try to filter out the NaN values but it doesn't work
    if pd.notnull(row['Company nameC']) and pd.notnull(row['Company nameA']) and pd.notnull(row['NEW ID']):
        newppmr[row['NEW ID']]=row['Company nameC']

The output is:

defaultdict(<type 'list'>, {nan: '1347 PROPERTY INS HLDGS INC', 1.0: 'AFLAC INC', 2.0: 'AGCO CORP', 3.0: 'AGL RESOURCES INC', 4.0: 'INVESCO LTD', 5.0: 'AK STEEL HOLDING CORP', 6.0: 'AMN HEALTHCARE SERVICES INC', nan: 'FOREVERGREEN WORLDWIDE CORP'

So I don't know how to filter out the NaN values or what's wrong with my code.

EDIT:

An example of my pandas data frame is:

        CUSIP           Company nameA   AÑO     NEW ID  Company nameC
42020   98912M201       NaN             NaN     NaN     ZAP
42021   989063102       NaN             NaN     NaN     ZAP.COM CORP
42022   98919T100       NaN             NaN     NaN     ZAZA ENERGY CORP
42023   98876R303       NaN             NaN     NaN     ZBB ENERGY CORP

Upvotes: 0

Views: 2374

Answers (1)

soerium

Reputation: 583

Here is an example of how to remove NaN keys from your dictionary.

Let's create a dict with NaN keys (NaN as it appears in numeric arrays):

>>> a = float("nan")
>>> b = float("nan")
>>> d = {a: 1, b: 2, 'c': 3}
>>> d
{nan: 1, nan: 2, 'c': 3}

Now, let's remove all NaN keys:

>>> from math import isnan
>>> c = dict((k, v) for k, v in d.items() if not (isinstance(k, float) and isnan(k)))
>>> c
{'c': 3}
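
Each key has to be tested with isnan because NaN never compares equal to itself, so a lookup or `del` with a fresh `float("nan")` would not find the existing key. A minimal sketch of the same filter as a reusable function follows; the name `drop_nan_keys` is hypothetical, and it uses `isinstance` so that float subclasses such as `numpy.float64` are caught as well.

```python
import math


def drop_nan_keys(d):
    """Return a copy of d without keys that are float NaN.

    drop_nan_keys is a hypothetical helper name used for illustration.
    isinstance also matches float subclasses, unlike type(k) == float.
    """
    return {k: v for k, v in d.items()
            if not (isinstance(k, float) and math.isnan(k))}


a = float("nan")
b = float("nan")
# Two distinct NaN objects become two separate keys, since NaN != NaN.
d = {a: 1, b: 2, "c": 3, 1.0: "x"}

cleaned = drop_nan_keys(d)
print(cleaned)
```

Non-float keys such as 'c' are kept without ever being passed to isnan, which would raise a TypeError on a string.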

Here is another scenario, using pandas, where the same kind of filter works fine. Maybe I'm missing something?

In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: df = pd.DataFrame({'a':[1,2,3,4,np.nan],'b':[np.nan,np.nan,np.nan,5,np.nan]})

In [4]: df
Out[4]: 
    a   b
0   1 NaN
1   2 NaN
2   3 NaN
3   4   5
4 NaN NaN

In [5]: for key, row in df.iterrows(): print(pd.notnull(row['a']))
True
True
True
True
False

In [6]: for key, row in df.iterrows(): print(pd.notnull(row['b']))
False
False
False
True
False

In [7]: x = {}

In [8]: for key, row in df.iterrows():
   ....:     if pd.notnull(row['b']) and pd.notnull(row['a']):
   ....:         x[row['b']]=row['a']
   ....:         

In [9]: x
Out[9]: {5.0: 4.0}
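
The same filter can also be written without an explicit loop. The sketch below assumes the question's column names ('NEW ID', 'Company nameA', 'Company nameC'); the toy values in `mr` are made up for illustration. It drops every row with a missing value in any of those columns using `DataFrame.dropna(subset=...)` and then builds a plain dict, rather than the defaultdict shown in the question.

```python
import numpy as np
import pandas as pd

# Toy frame mirroring the question's column names; the values are made up.
mr = pd.DataFrame({
    "NEW ID": [1.0, np.nan, 2.0, 3.0],
    "Company nameA": ["AFLAC", "ZAP", np.nan, "AGL"],
    "Company nameC": ["AFLAC INC", "ZAP", "AGCO CORP", "AGL RESOURCES INC"],
})

# Keep only rows where all three columns are present.
cols = ["NEW ID", "Company nameA", "Company nameC"]
complete = mr.dropna(subset=cols)

# Map each remaining ID to its company name.
newppmr = dict(zip(complete["NEW ID"], complete["Company nameC"]))
print(newppmr)
```

Because the rows are filtered before the dict is built, no NaN can end up as a key, provided the dict starts out empty.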

Upvotes: 1
