Reputation: 8431
In my dataset, I have a column like this:
hist = ['A','FAT',nan,'TAH']
I need to loop over it to find the cells that contain an 'A'. Here is my code:
import numpy as np
import pandas as pd
import math
from numpy import nan
for rowId in np.arange(dt.shape[0]):
    for hist in np.arange(10):
        if math.isnan(dt.iloc[rowId,hist])!=True:
            if 'A' in dt.iloc[rowId,hist]:
                print("A found in: "+str(dt.iloc[rowId,hist]))
On the line if 'A' in dt.iloc[rowId,hist], when the value of dt.iloc[rowId,hist] is NaN, Python complains with:
TypeError: argument of type 'float' is not iterable
So I decided to add the check if math.isnan(dt.iloc[rowId,hist])!=True:, but that leads to a different error:
TypeError: must be real number, not str
How can I find the values that contain 'A'?
Upvotes: 1
Views: 11821
Reputation: 477160
Instead of iterating over the cells, you can use .str.contains [pandas-doc] on the column, like:
>>> df
     0
0    A
1  FAT
2  NaN
3  TAH
>>> df[0].str.contains('A')
0    True
1    True
2     NaN
3    True
Name: 0, dtype: object
You can then, for example, filter the rows or obtain the indices:
>>> df[df[0].str.contains('A') == True]
     0
0    A
1  FAT
3  TAH
>>> df.index[df[0].str.contains('A') == True]
Int64Index([0, 1, 3], dtype='int64')
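Because every non-missing value here contains 'A', .notna() happens to give the same result as == True (in general it only drops the missing values, so it would also keep strings that do not contain 'A'):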
>>> df[df[0].str.contains('A').notna()]
     0
0    A
1  FAT
3  TAH
>>> df.index[df[0].str.contains('A').notna()]
Int64Index([0, 1, 3], dtype='int64')
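A caveat worth checking before relying on .notna(): it only removes the missing entries of the mask, so a string without an 'A' would survive the filter. The sketch below uses a hypothetical extra value 'XYZ' (not in the original data) and an explicit object dtype, so that the mask keeps NaN for the missing cell as in the output above.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the original column plus 'XYZ', which contains no 'A'.
# dtype=object is an assumption made so the mask holds True / False / NaN.
s = pd.Series(['A', 'FAT', np.nan, 'TAH', 'XYZ'], dtype=object)
mask = s.str.contains('A')

# Keeps only the rows where the match is actually True.
kept_by_eq = s[mask == True].tolist()

# Keeps every row whose mask entry is not missing, including False ones.
kept_by_notna = s[mask.notna()].tolist()

print(kept_by_eq)
print(kept_by_notna)
```

The first list holds only the real matches, while the second also includes 'XYZ'.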
Or handle the missing values inside .str.contains() itself with na=False, as @Erfan suggests:
>>> df[df[0].str.contains('A', na=False)]
     0
0    A
1  FAT
3  TAH
>>> df.index[df[0].str.contains('A', na=False)]
Int64Index([0, 1, 3], dtype='int64')
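As a self-contained sketch of the na=False variant: note that .str.contains treats its pattern as a regular expression by default, so passing regex=False may be safer when the search text is meant literally. The extra value 'X.Y' and the variable names below are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the original column plus 'X.Y' to show literal matching.
df = pd.DataFrame({0: ['A', 'FAT', np.nan, 'TAH', 'X.Y']})

# na=False turns missing values into False, so the mask is purely boolean.
# regex=False makes the pattern a literal substring instead of a regex.
has_a = df[0].str.contains('A', na=False, regex=False)
has_dot = df[0].str.contains('.', na=False, regex=False)

rows_with_a = df.index[has_a].tolist()
rows_with_dot = df.index[has_dot].tolist()

print(rows_with_a)
print(rows_with_dot)
```

With the default regex=True, the pattern '.' would match any non-empty string rather than only the cell containing a literal dot.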
So you can print the values with:
for val in df[df[0].str.contains('A') == True][0]:
    print('A found in {}'.format(val))
This gives us:
>>> for val in df[df[0].str.contains('A') == True][0]:
...     print('A found in {}'.format(val))
...
A found in A
A found in FAT
A found in TAH
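If an explicit loop over several columns is still wanted, as in the question, the two errors can be avoided with a type check: math.isnan accepts only numbers, so it raises on string cells, while the in test fails on NaN because NaN is a float. A minimal sketch, assuming a small frame dt with made-up column names h0 and h1 and a hypothetical helper cells_containing:

```python
import numpy as np
import pandas as pd

def cells_containing(frame, needle):
    """Collect (row label, column label, value) for string cells containing needle."""
    hits = []
    for row_label, row in frame.iterrows():
        for col_label, value in row.items():
            # isinstance skips NaN (a float) and any other non-string cell,
            # so neither the 'in' test nor a NaN check can raise TypeError.
            if isinstance(value, str) and needle in value:
                hits.append((row_label, col_label, value))
    return hits

# Hypothetical two-column frame standing in for the question's dt.
dt = pd.DataFrame({
    'h0': ['A', 'FAT', np.nan, 'TAH'],
    'h1': ['B', np.nan, 'CAT', 'DOG'],
})

hits = cells_containing(dt, 'A')
for row_label, col_label, value in hits:
    print('A found in: {} (row {}, column {})'.format(value, row_label, col_label))
```

For large frames the vectorized .str.contains approach is likely to be faster; this loop is only meant to show why the type check resolves both errors.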
Upvotes: 1