scott martin

Reputation: 1293

Pandas - Extract text between two strings

I have a DataFrame with a column whose values are in the format below:

---
- !ruby/hash:Control::Keys
  name: sample1
  value: 101

I am trying to extract just the name and value and store them as new columns. I tried

df['col'].str.extract(r'name:(\w+)value')

but it returned NaN.

Expected values:

name,value
sample1,101

Upvotes: 0

Views: 662

Answers (2)

abhilb

Reputation: 5757

You can try

>>> df['names'] = df.col.str.extract(r'(?<=name:)\s+(\w+)')
>>> df['values'] = df.col.str.extract(r'(?<=value:)\s+(\w+)')
>>> df
                                                 col    names values
0  ---\n- !ruby/hash:Control::Keys\n  name: sampl...  sample1    101
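As a side note, the attempt in the question returns NaN because \w+ cannot match the space after "name:" or the newline before "value". A minimal sketch of a single-call variant using named groups, assuming every cell has the same layout as the sample in the question (the sample string and the variable names here are illustrative, not from the original data):

```python
import pandas as pd

# Sample cell reconstructed from the question; "col" is the column name used there.
text = "---\n- !ruby/hash:Control::Keys\n  name: sample1\n  value: 101\n"
df = pd.DataFrame({"col": [text]})

# Named groups become the column names of the frame returned by str.extract.
# \s+ between the two fields also consumes the newline and the indentation.
pattern = r"name:\s*(?P<name>\w+)\s+value:\s*(?P<value>\w+)"
extracted = df["col"].str.extract(pattern)

# Attach the two new columns to the original frame.
df = df.join(extracted)
print(df[["name", "value"]])
```

Both extracted columns hold strings; convert value with pd.to_numeric if you need numbers.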

Upvotes: 1

Sudhandar

Reputation: 40

Try using this regex pattern:

r'(name: (\w+))|(value: (\w+))'

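Note the literal space after each colon.

With re.findall, each match is a tuple of four groups: the second group of the name match holds 'sample1' and the fourth group of the value match holds '101' (both as strings). Groups that did not take part in a match come back as empty strings.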
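To get a flat list such as ['sample1', '101'] directly, one option is a pattern with a single capturing group. A minimal sketch with Python's re module, assuming the cell text looks like the sample in the question; the non-capturing alternation is a substitution for the pattern above, not the same pattern:

```python
import re

# Sample cell reconstructed from the question.
text = "---\n- !ruby/hash:Control::Keys\n  name: sample1\n  value: 101\n"

# (?:...) groups the alternation without capturing it, so findall
# returns only the contents of the single (\w+) group for each match.
matches = re.findall(r"(?:name|value): (\w+)", text)
print(matches)  # ['sample1', '101']
```

The values are strings, so 101 comes back as '101'; cast it with int() if needed.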

Upvotes: 0
