Reputation: 1437
Say I have a pandas DataFrame:
df = pd.DataFrame({'a': [1,2,3,'e',4], 'b': [1,2,3,4,5]})
I would like to get the index of where an element of df is a string. How can I do that other than by checking element after element, which is slow and inefficient?
Upvotes: 1
Views: 1126
Reputation: 210832
This is not exactly what you asked for. Rather than finding strings, it flags the rows containing elements that can't be converted to numeric values:
In [231]: df
Out[231]:
   a  b
0  1  1
1  2  2
2  3  3
3  e  4
4  4  5
In [232]: df.apply(pd.to_numeric, errors='coerce').isnull().any(axis=1)
Out[232]:
0 False
1 False
2 False
3 True
4 False
dtype: bool
In [233]: df.loc[df.apply(pd.to_numeric, errors='coerce').isnull().any(axis=1)]
Out[233]:
   a  b
3  e  4
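Since the question asks for the index rather than the rows themselves, the boolean mask can also be turned into a list of index labels. This is a minimal sketch under the same assumptions as the session above; the names `mask` and `bad_index` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 'e', 4], 'b': [1, 2, 3, 4, 5]})

# True for every row holding at least one value that to_numeric cannot convert
mask = df.apply(pd.to_numeric, errors='coerce').isnull().any(axis=1)

# Keep only the True entries of the mask and read off their index labels
bad_index = mask[mask].index.tolist()
print(bad_index)
```

Note that this also flags rows that already contain NaN, because those come out as null after the conversion too.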
Or a more efficient variant from @Zero, which checks only the string (object) columns:
In [237]: df.select_dtypes(['object']).apply(pd.to_numeric, errors='coerce').isnull().any(axis=1)
Out[237]:
0 False
1 False
2 False
3 True
4 False
dtype: bool
In [238]: df[df.select_dtypes(['object']).apply(pd.to_numeric, errors='coerce').isnull().any(axis=1)]
Out[238]:
   a  b
3  e  4
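If the goal really is "is this element a Python str" rather than "can this element be converted to a number", the two differ for values such as '5', which to_numeric converts without complaint. A possible sketch of a direct type check, restricted to the object columns as in the variant above, could look like this. It still tests each element, but only in the columns that can hold strings; the names `obj_cols`, `is_str`, `str_rows` and `str_index` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 'e', 4], 'b': [1, 2, 3, 4, 5]})

# Only object columns can hold Python strings mixed with numbers
obj_cols = df.select_dtypes(['object'])

# Element-wise isinstance check, applied column by column
is_str = obj_cols.apply(lambda col: col.map(lambda v: isinstance(v, str)))

# Rows with at least one string, then their index labels
str_rows = is_str.any(axis=1)
str_index = str_rows[str_rows].index.tolist()
print(str_index)
```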
Upvotes: 2