SGeuer

Reputation: 147

Search for partial strings from a list and add a column with the matched partial string

I want to search a DataFrame column for partial strings that I saved in a Series, and I want to create a new column containing the string that was found in each row. Part of my question was solved by "pandas: test if string contains one of the substrings in a list":

For example, say I have the series s = pd.Series(['cat','hat','dog','fog','pet']), and I want to find all places where s contains any of ['og', 'at'], I would want to get everything but pet.

The solution is:

>>> searchfor = ['og', 'at']
>>> s[s.str.contains('|'.join(searchfor))]
0    cat
1    hat
2    dog
3    fog
dtype: object

However, I would like to get a result with an additional column holding the substring that matched:

         pet    contains
    0    cat    at
    1    hat    at
    2    dog    og
    3    fog    og
    dtype: object

Upvotes: 2

Views: 185

Answers (1)

jezrael

Reputation: 862591

Use str.extract to capture the matching substring. Rows with no match get NaN, so add dropna to remove them:

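searchfor = ['og', 'at']
# wrap the alternation in a capture group so extract returns the matched substring
df['new'] = df['pet'].str.extract('(' + '|'.join(searchfor) + ')', expand=False)
# drop rows where none of the substrings matched (NaN in 'new')
df = df.dropna(subset=['new'])
print(df)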
   pet contains1 new
0  cat        at  at
1  hat        at  at
2  dog        og  og
3  fog        og  og
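
For reference, here is a self-contained sketch of the same approach that can be run as is. The sample DataFrame is built from the series in the question, and the column name contains is taken from the desired output. The re.escape call is an addition that the answer above does not include: it is only needed if the substrings may contain regex metacharacters such as . or +.

```python
import re

import pandas as pd

# Sample data from the question; 'pet' itself matches none of the substrings
df = pd.DataFrame({"pet": ["cat", "hat", "dog", "fog", "pet"]})
searchfor = ["og", "at"]

# Escape each substring so it is matched literally (an added precaution),
# then wrap the alternation in one capture group for str.extract
pattern = "(" + "|".join(re.escape(s) for s in searchfor) + ")"

# expand=False returns a Series: the first match per row, NaN if no match
df["contains"] = df["pet"].str.extract(pattern, expand=False)

# Keep only the rows where one of the substrings was found
result = df.dropna(subset=["contains"])
print(result)
```

Note that extract returns only the first match in each row. If a value could contain several of the substrings and all of them are needed, str.findall with the same pattern may be a better fit.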

Upvotes: 2
