Reputation: 63
I would like to ask for your help with an "if statement" inside a function that I am using to aggregate some data in a dataframe. With this function I want to check whether any of several strings occurs in the string held in one column of my dataframe, and return a specific value together with the matching string.
This is what I have so far, and it does what I need. For example, if "f" and "k" are in my string ("fk"), then once I apply my function to this row (find_string("fk")), it returns "success". Additionally, I would like to get the string that was found in the list, in this case 'f'. Something like "success" + "f".
def find_string(b):
    if "a" in b or "c" in b or "d" in b or "f" in b:
        return "success"  ## here I want to get the matching string
Any suggestions?
I am using Python 2.7.13 with the pandas library.
Upvotes: 1
Views: 46
Reputation: 21595
def find_string(b):
    for c in ['a', 'c', 'd', 'f']:
        if c in b:
            return 'success ' + c
    return 'failure'
>>> find_string('fk')
'success f'
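If you want every match rather than only the first one, the same loop can be written as a list comprehension. This is just a sketch: the name find_strings, the candidates parameter, and the comma-separated output are my own assumptions, not something from the question.

```python
def find_strings(text, candidates=('a', 'c', 'd', 'f')):
    """Return 'success ' plus every candidate found in text, or 'failure'."""
    # Candidates are checked in their listed order, so the result is deterministic.
    found = [c for c in candidates if c in text]
    if found:
        return 'success ' + ','.join(found)
    return 'failure'


print(find_strings('fk'))    # success f
print(find_strings('dock'))  # success c,d
print(find_strings('xyz'))   # failure
```

Because it relies on the `in` operator, this also works when the candidates are longer than one character.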
Upvotes: 1
Reputation: 54223
You could simply use set intersections. This doesn't require any if statements or loops and should be very efficient, though it only works because each search string is a single character:
>>> set('try to find a substring') & set('acdf')
{'a', 'f', 'd'}
>>> set('no substring') & set('acdf')
set()
If you really want to use pandas, look at @Coldspeed's solution.
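One caveat with the set approach, sketched below: a set of a string is a set of its characters, so the intersection cannot tell whether a multi-character substring is present. The helper names here are hypothetical and only there to contrast the two checks.

```python
def matches_by_set(text, chars='acdf'):
    # Set intersection compares individual characters only.
    return set(text) & set(chars)


def matches_by_substring(text, candidates):
    # A membership test per candidate also handles multi-character substrings.
    return [c for c in candidates if c in text]


print(sorted(matches_by_set('try to find a substring')))  # ['a', 'd', 'f']

# 'ab' is not a substring of 'ba', yet the character sets still intersect.
print(sorted(matches_by_set('ba', 'ab')))    # ['a', 'b']
print(matches_by_substring('ba', ['ab']))    # []
```

So for single letters the intersection is fine; for longer search strings you would need the `in` test or a regex.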
Upvotes: 1
Reputation: 402423
If you're using pandas, use str.extract + np.where; it is typically much faster than applying a Python function row by row.
import numpy as np
# v holds the first matching character per row, or NaN when nothing matches
v = df['yourCol'].str.extract('([acdf])', expand=False)
df['newCol'] = np.where(v.isnull(), '', 'success' + v.astype(str))
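For anyone who wants to try this end to end, here is a minimal self-contained sketch. The sample rows are made up, and I use fillna('') instead of astype(str) and add a space after 'success'; both are my own choices for illustration rather than part of the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names follow the ones used above.
df = pd.DataFrame({'yourCol': ['fk', 'xyz', 'dock']})

# Capture the first character from the set a, c, d, f; rows without a match get NaN.
v = df['yourCol'].str.extract('([acdf])', expand=False)

# Empty string where nothing matched, otherwise 'success ' plus the match.
df['newCol'] = np.where(v.isnull(), '', 'success ' + v.fillna(''))

print(df['newCol'].tolist())  # ['success f', '', 'success d']
```

Note that the regex reports the leftmost match in the text ('d' for 'dock'), whereas a loop over ['a', 'c', 'd', 'f'] reports the first candidate in the list that occurs ('c' for 'dock'), so the two approaches can disagree when a row contains several of the letters.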
Upvotes: 2