John Tan

Reputation: 1385

Using regex to extract substrings

I have a string:

s = r'"url" : "a", "meta": "b", "url" : "c"'

What I want is to capture each "url" : ... substring up to, but not including, the following comma, so the expected output is a list:

[r'"url" : "a"', r'"url" : "c"']

I am using:

re.findall(r'("url"):(.*),', s)

but all it does is return the entire string. Is there something I am doing wrong?

Upvotes: 0

Views: 64

Answers (2)

Ammar Aslam

Reputation: 670

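Your .* is greedy, so it runs on to the last "," in the string instead of stopping at the first one; (.*?) is non-greedy and stops as early as it can. The trailing comma is also optional (the last entry has none), so the pattern marks it with ,? to allow for its absence. Note that the example below uses the string without spaces around the colons.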

import re

s = r'"url":"a","meta":"b","url":"c"'

print(re.findall(r'("url"):"(.*?)",?', s))
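To make the greedy versus non-greedy difference concrete on the question's original string (with the spaces), here is a minimal sketch. The \s* parts and the variable names greedy and lazy are my own additions for illustration, assuming the whitespace around the colon may vary; with no capture groups in the second pattern, re.findall returns each whole match as a string.

```python
import re

s = r'"url" : "a", "meta": "b", "url" : "c"'

# Greedy: .* runs to the end and backtracks only to the LAST comma,
# so a single long capture comes back.
greedy = re.findall(r'"url"\s*:(.*),', s)

# Non-greedy, no capture groups: findall returns each whole match.
# \s* tolerates optional spaces around the colon (an assumption).
lazy = re.findall(r'"url"\s*:\s*".*?"', s)

print(greedy)
print(lazy)
```

The second call should give the list of "url" entries the question asks for, without the trailing commas.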

Upvotes: 3

Sergio Lema

Reputation: 1629

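You must exclude the , from the match rather than let .* swallow it: the negated character class [^,]* matches everything up to the next comma, so the comma stays outside the inner group. Try this: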

re.findall(r'(("url" :[^,]*),*)', s)
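Because that pattern has two nested groups, re.findall returns a tuple per match. If only the plain strings are wanted, a simpler variant of the same idea might look like the sketch below. The \s* and the name matches are my additions, and the approach assumes the values themselves never contain a comma.

```python
import re

s = r'"url" : "a", "meta": "b", "url" : "c"'

# [^,]* matches any run of characters that are not a comma,
# so each match stops just before the separator.
# No capture groups, so findall returns the whole matches as strings.
matches = re.findall(r'"url"\s*:[^,]*', s)

print(matches)
```

If a value could contain a comma inside its quotes, this would cut the match short, and matching the quoted value explicitly would be safer.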

Upvotes: 1
