Reputation: 675
Consider a pandas DataFrame that has values such as 'a - b'. I would like to check whether '-' occurs anywhere across all values of the DataFrame, without looping through individual columns. Clearly, a check such as the following won't work:
if '-' in df.values
Any suggestions on how to check for this? Thanks.
Upvotes: 1
Views: 110
Reputation: 294278
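You can use replace to swap a regex match with something else, then check for equality. The result is a boolean mask with the same shape as df; chain .any().any() to reduce it to a single value. Note that cells already equal to True (or the integer 1) also compare equal: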
df.replace('.*-.*', True, regex=True).eq(True)
Upvotes: 1
Reputation: 210842
I'd use stack() + .str.contains() in this case:
In [10]: df
Out[10]:
   a      b      c
0  1  a - b      w
1  2      c      z
2  3      d  2 - 3
In [11]: df.stack().str.contains('-').any()
Out[11]: True
In [12]: df.stack().str.contains('-')
Out[12]:
0  a      NaN
   b     True
   c    False
1  a      NaN
   b    False
   c    False
2  a      NaN
   b    False
   c     True
dtype: object
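The NaN entries above come from the numeric column a, which has no string methods. A minimal sketch of one way around that, assuming it is acceptable to cast every cell to str first (the sample frame mirrors the one shown above; the names hits and has_dash are only illustrative):

```python
import pandas as pd

# Sample frame shaped like the one in the answer above.
df = pd.DataFrame({
    "a": [1, 2, 3],
    "b": ["a - b", "c", "d"],
    "c": ["w", "z", "2 - 3"],
})

# Cast every cell to str so numeric cells yield False instead of NaN.
# regex=False treats '-' as a literal substring rather than a pattern.
hits = df.astype(str).stack().str.contains("-", regex=False)

# Reduce the per-cell result to a single True/False.
has_dash = bool(hits.any())
print(has_dash)
```

One caveat with the cast: a negative number such as -1 becomes the string '-1' and would count as a match, so this only suits data where that is the intended behaviour.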
Upvotes: 1
Reputation: 3130
Using NumPy: np.char.find(a, s) (also reachable as np.core.defchararray.find in older NumPy releases) returns, for each element of a, the lowest index at which the substring s appears; if it's not present, -1 is returned.
(np.char.find(df.values.astype(str), '-') > -1).any()
returns True if '-' is present anywhere in df.
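To make the intermediate result visible, here is a small sketch of the same idea using the public np.char.find name. The sample frame and the variable names cells, positions and has_dash are assumptions for illustration, not part of the answer:

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame with one numeric and two string columns.
df = pd.DataFrame({
    "a": [1, 2, 3],
    "b": ["a - b", "c", "d"],
    "c": ["w", "z", "2 - 3"],
})

# Convert all cells to a NumPy array of fixed-width strings.
cells = df.to_numpy().astype(str)

# For each cell: lowest index of '-' in that cell, or -1 if absent.
positions = np.char.find(cells, "-")

# True if any cell contains the substring.
has_dash = bool((positions > -1).any())
print(positions)
print(has_dash)
```

Keeping positions around also lets you locate the matching cells, for example with np.argwhere(positions > -1).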
Upvotes: 0
Reputation: 18208
One way is to call flatten on values and use a list comprehension.
import pandas as pd
df = pd.DataFrame([['val1','a-b', 'val3'],['val4','3', 'val5']],columns=['col1','col2', 'col3'])
print(df)
Output:
col1 col2 col3
0 val1 a-b val3
1 val4 3 val5
Now, to search for '-' (str(val) keeps the check from raising on non-string cells):
find_value = [val for val in df.values.flatten() if '-' in str(val)]
print(find_value)
Output:
['a-b']
Upvotes: 0