sayhello
Reputation: 185

Find substring in pandas

I have a column of strings, and some rows contain a word in brackets. For example:

Test String

(Test1) String

Test (String1)

I need to extract the substrings in brackets using pandas, so the output here should be ['Test1', 'String1'].

I tried something like this, but it doesn't match only the word inside the brackets:

df['column'].str.extract('([A-Z]\w{0,})')

Upvotes: 1

Views: 996

Answers (1)

EdChum
Reputation: 394409

You can use the following regex pattern:

In [180]:
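# expand=False returns a Series; recent pandas versions return a DataFrame by default
df['text'].str.extract(r'\((\w+)\)', expand=False)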

Out[180]:
0        NaN
1      Test1
2    String1
Name: text, dtype: object

This matches any word enclosed in brackets. Brackets are regex metacharacters, so the literal ones must be escaped as \( and \). The unescaped pair between them forms the capture group, and \w+ inside it matches one or more word characters.
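To see what each part of the pattern does outside pandas, here is a minimal sketch using Python's standard `re` module. The `samples` list is just the three rows from the question, and `find_bracketed` is a hypothetical helper name used only for illustration.

```python
import re

# \( and \) match literal parentheses; (\w+) captures the
# one-or-more word characters between them.
BRACKETED = re.compile(r'\((\w+)\)')


def find_bracketed(text):
    """Return the first word enclosed in parentheses, or None if there is none."""
    match = BRACKETED.search(text)
    return match.group(1) if match else None


samples = ['Test String', '(Test1) String', 'Test (String1)']
results = [find_bracketed(s) for s in samples]
print(results)  # [None, 'Test1', 'String1']
```

Rows without a bracketed word give None here, which corresponds to the NaN that pandas shows for row 0.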

If you want a list you can call dropna and then tolist:

In [185]:
# expand=False gives a Series, which has tolist(); a DataFrame does not
df['text'].str.extract(r'\((\w+)\)', expand=False).dropna().tolist()

Out[185]:
['Test1', 'String1']
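For a version that can be run end to end, here is a sketch that rebuilds the sample data from the question. The column name `text` follows the answer, and the DataFrame construction is an assumption about how the data is stored. `expand=False` is passed explicitly because recent pandas versions return a DataFrame from `str.extract` by default, and a DataFrame has no `tolist` method.

```python
import pandas as pd

# Sample rows from the question; the column name 'text' is assumed.
df = pd.DataFrame({'text': ['Test String', '(Test1) String', 'Test (String1)']})

# With a single capture group and expand=False, the result is a Series.
extracted = df['text'].str.extract(r'\((\w+)\)', expand=False)

# Drop rows with no bracketed word (NaN) and convert to a plain list.
words = extracted.dropna().tolist()
print(words)  # ['Test1', 'String1']
```

Note that `str.extract` only returns the first match in each row.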

Upvotes: 1
