Jeff

Reputation: 8431

How to skip NaN values in a loop?

In my dataset, I have a column like this:

hist = ['A','FAT',nan,'TAH']

I need to use a loop to find the cells that contain an 'A'. Here is my code:

    import numpy as np
    import pandas as pd
    import math
    from numpy import nan

    for rowId in np.arange(dt.shape[0]):
        for hist in np.arange(10):
            if math.isnan(dt.iloc[rowId,hist])!=True:
                if 'A' in dt.iloc[rowId,hist]:
                    print("A found in: "+str(dt.iloc[rowId,hist]))

On the line if 'A' in dt.iloc[rowId,hist], when the value of dt.iloc[rowId,hist] is NaN, it fails with TypeError: argument of type 'float' is not iterable

So I decided to add if math.isnan(dt.iloc[rowId,hist])!=True:, but that leads to the error below when the value is a string:

TypeError: must be real number, not str

How can I find the values that contain 'A'?

Upvotes: 1

Views: 11821

Answers (1)

willeM_ Van Onsem

Reputation: 477160

Instead of iterating over the cells, you can use the .str.contains method on the column. It returns NaN for missing values rather than raising an error:

>>> df
     0
0    A
1  FAT
2  NaN
3  TAH
>>> df[0].str.contains('A')
0    True
1    True
2     NaN
3    True
Name: 0, dtype: object
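For anyone who wants to reproduce the output above, here is a minimal sketch of how such a frame could be built. The construction is an assumption (the answer does not show how df was created), and dtype=object is set explicitly so that missing entries stay NaN in the result:

```python
import numpy as np
import pandas as pd

# Assumed construction: one column named 0, holding the question's values.
# dtype=object keeps the column as plain Python objects (str and NaN).
df = pd.DataFrame({0: pd.Series(['A', 'FAT', np.nan, 'TAH'], dtype=object)})

# Element-wise substring test; the missing value yields NaN, not an error.
mask = df[0].str.contains('A')
print(mask)
```

The mask can then be used for filtering once the NaN entry has been dealt with.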

You can then, for example, filter the rows or obtain the indices:

>>> df[df[0].str.contains('A') == True]
     0
0    A
1  FAT
3  TAH
>>> df.index[df[0].str.contains('A') == True]
Int64Index([0, 1, 3], dtype='int64')

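Or we can use .notna() instead of == True. Note that this only gives the same result here because every non-missing value contains 'A'; in general, .notna() also keeps rows where the result is False: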

>>> df[df[0].str.contains('A').notna()]
     0
0    A
1  FAT
3  TAH
>>> df.index[df[0].str.contains('A').notna()]
Int64Index([0, 1, 3], dtype='int64')
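To make the difference between the .notna() mask and the == True mask concrete, here is a small sketch on hypothetical data. The value 'TOP' is not from the question; it is added only because it is a string without an 'A':

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'TOP' is a non-missing string that does not contain 'A'.
s = pd.Series(['A', 'TOP', np.nan, 'TAH'], dtype=object)

contains_a = s.str.contains('A')  # True, False, NaN, True

# .notna() keeps every row whose result is not NaN, including False rows.
kept_by_notna = s[contains_a.notna()].tolist()

# == True keeps only the rows where the substring was actually found.
kept_by_eq_true = s[contains_a == True].tolist()

print(kept_by_notna)
print(kept_by_eq_true)
```

So .notna() is only a safe shortcut when you already know that every non-missing value matches.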

Or treat missing values as non-matches directly in .str.contains() by passing na=False, as @Erfan suggests:

>>> df[df[0].str.contains('A', na=False)]
     0
0    A
1  FAT
3  TAH
>>> df.index[df[0].str.contains('A', na=False)]
Int64Index([0, 1, 3], dtype='int64')

So you can print the values with:

for val in df[df[0].str.contains('A') == True][0]:
    print('A found in {}'.format(val))

This gives us:

>>> for val in df[df[0].str.contains('A') == True][0]:
...     print('A found in {}'.format(val))
... 
A found in A
A found in FAT
A found in TAH
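If an explicit loop over several columns is still needed, as in the question, one option is to test the type of each cell instead of calling math.isnan, which only accepts numbers and therefore raises on strings. The sketch below assumes a hypothetical two-column frame named dt; the column names h0 and h1 and the second column's values are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's `dt`.
dt = pd.DataFrame({
    'h0': ['A', 'FAT', np.nan, 'TAH'],
    'h1': [np.nan, 'B', 'CAT', 'D'],
})

found = []
for row_id in range(dt.shape[0]):
    for col_id in range(dt.shape[1]):
        value = dt.iloc[row_id, col_id]
        # Only strings support the `in` test; NaN is a float and is skipped.
        if isinstance(value, str) and 'A' in value:
            found.append((row_id, col_id, value))
            print("A found in: " + value)
```

The vectorized .str.contains approach above is usually preferable; this loop is only meant to show how the original code could skip NaN cells without raising.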

Upvotes: 1
