Reputation: 11477
I am trying to write a regular expression in Python that will match either a quoted string with spaces or an unquoted string without spaces. For example given the string term:foo
the result would be foo
and given the string term:"foo bar"
the result would be foo bar
. So far I've come up with the following regular expression:
r = re.compile(r'''term:([^ "]+)|term:"([^"]+)"''')
The problem is that the match can come in either group(1)
or group(2)
so I have to do something like this:
m = r.match(search_string)
term = m.group(1) or m.group(2)
Is there a way I can do this all in one step?
Upvotes: 2
Views: 2351
Reputation: 120598
Avoid grouping, and instead use lookahead/lookbehind assertions to eliminate the parts that are not needed:
s = 'term:foo term:"foo bar" term:bar foo term:"foo term:'
re.findall(r'(?<=term:)[^" ]+|(?<=term:")[^"]+(?=")', s)
Gives:
['foo', 'foo bar', 'bar']
Upvotes: 4
Reputation: 77099
It doesn't seem that you really want re.match
here. Your regex is almost right, but you're grouping too much. How about this?
>>> s
('xyz term:abc 123 foo', 'foo term:"abc 123 "foo')
>>> re.findall(r'term:([^ "]+|"[^"]+")', '\n'.join(s))
['abc', '"abc 123 "']
Upvotes: 1