Amrith Krishna

Reputation: 2851

Extract multiple substrings from a string in pandas

I have a pandas Series from which I need to extract every substring enclosed in parentheses. A string may contain several such substrings or none at all. How can this be handled?

abc(def)ghi(jkl)aaa
jklmnopqr(jkl)
(ab)cde(ghi)
lmnoprst uvwxyz

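If I use str.extract, as in a.str.extract('.*\((.*)\)'), I get only one substring per string: str.extract returns a single match, and because the leading .* is greedy, that match is the last parenthesized group. So in effect, I miss the substring def.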
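
To reproduce the behaviour described here, the following is a minimal sketch that rebuilds the four sample strings as a Series and applies the same pattern. The name a follows the question; how the Series is constructed and the name last_only are assumptions made for illustration.

```python
import pandas as pd

# Sample strings from the question; constructing them like this is an assumption.
a = pd.Series([
    "abc(def)ghi(jkl)aaa",
    "jklmnopqr(jkl)",
    "(ab)cde(ghi)",
    "lmnoprst uvwxyz",
])

# The leading .* is greedy, so it consumes everything up to the last "(".
# expand=False returns a Series because the pattern has a single group.
last_only = a.str.extract(r".*\((.*)\)", expand=False)
print(last_only)
```

Each row yields at most one value (jkl, jkl, ghi, and NaN for the row without parentheses), so def and ab are lost.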

How can this be solved?

The desired outcome is:

def
jkl
ab
ghi

Upvotes: 0

Views: 998

Answers (1)

Scott Boston

Reputation: 153560

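Try str.extractall, which returns every match of the pattern rather than one per row (here df[0] is the column holding the strings):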

df[0].str.extractall(r'\((\w+)\)')

Output:

           0
  match     
0 0      def
  1      jkl
1 0      jkl
2 0       ab
  1      ghi
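
For a self-contained version, the sketch below rebuilds the sample strings as a Series and flattens the extractall result into a plain list. The names s, matches, flat and alt are illustrative and not from the original post. It also shows Series.str.findall followed by explode as an alternative that should give the same values.

```python
import pandas as pd

# Sample strings from the question (hypothetical variable name s).
s = pd.Series([
    "abc(def)ghi(jkl)aaa",
    "jklmnopqr(jkl)",
    "(ab)cde(ghi)",
    "lmnoprst uvwxyz",
])

# extractall returns one row per match, indexed by (original row, match number).
matches = s.str.extractall(r"\((\w+)\)")

# Column 0 holds the single capture group; flatten it to a list.
flat = matches[0].tolist()
print(flat)

# Alternative: findall gives a list per row, explode gives one row per element.
# Rows with no matches become NaN after explode and are dropped.
alt = s.str.findall(r"\((\w+)\)").explode().dropna().tolist()
print(alt)
```

The value jkl appears twice because it occurs in two different rows; if only unique values are wanted, matches[0].drop_duplicates() can be applied before flattening. Note also that \w+ matches only letters, digits and underscores, so a parenthesized substring containing spaces or punctuation would need a different pattern such as [^)]+.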

Upvotes: 2
