gerdemb

Reputation: 11477

Python regular expression to match either a quoted or unquoted string

I am trying to write a regular expression in Python that will match either a quoted string with spaces or an unquoted string without spaces. For example, given the string term:foo the result would be foo, and given the string term:"foo bar" the result would be foo bar. So far I've come up with the following regular expression:

import re
r = re.compile(r'''term:([^ "]+)|term:"([^"]+)"''')

The problem is that the match can end up in either group(1) or group(2), so I have to do something like this:

m = r.match(search_string)
term = m.group(1) or m.group(2)

Is there a way I can do this all in one step?

Upvotes: 2

Views: 2351

Answers (2)

ekhumoro

Reputation: 120598

Avoid grouping, and instead use lookahead/lookbehind assertions to eliminate the parts that are not needed:

s = 'term:foo term:"foo bar" term:bar foo term:"foo term:'
re.findall(r'(?<=term:)[^" ]+|(?<=term:")[^"]+(?=")', s)

Gives:

['foo', 'foo bar', 'bar']
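
As a minimal sketch, the same pattern can be compiled once and wrapped in a helper. The names TERM_RE and find_terms are made up for this illustration and are not part of the answer. Note that Python's re module only accepts fixed-width lookbehinds, which both (?<=term:) and (?<=term:") are.

```python
import re

# Two alternatives: an unquoted run right after 'term:', or a quoted run
# right after 'term:"' that must be followed by a closing quote.
TERM_RE = re.compile(r'(?<=term:)[^" ]+|(?<=term:")[^"]+(?=")')

def find_terms(text):
    """Return every term value in text, without the surrounding quotes."""
    # The pattern has no capturing groups, so findall returns whole matches.
    return TERM_RE.findall(text)

s = 'term:foo term:"foo bar" term:bar foo term:"foo term:'
print(find_terms(s))
```

The unterminated term:"foo at the end is skipped because the lookahead for a closing quote never succeeds.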

Upvotes: 4

kojiro

Reputation: 77099

It doesn't seem that you really want re.match here, since it only matches at the start of the string. Your regex is almost right, but you're grouping too much. How about this?

>>> s
('xyz term:abc 123 foo', 'foo term:"abc 123 "foo')
>>> re.findall(r'term:([^ "]+|"[^"]+")', '\n'.join(s))
['abc', '"abc 123 "']
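
Note that the single group keeps the surrounding quotes for the quoted case. A rough sketch of one way to drop them, assuming a post-processing step is acceptable, is shown here. The names TERM_RE and extract_terms are hypothetical and only serve this example.

```python
import re

# Same pattern as above: one group that captures either an unquoted run
# or a quoted run including its quotes.
TERM_RE = re.compile(r'term:([^ "]+|"[^"]+")')

def extract_terms(text):
    """Return term values from text with any enclosing quotes removed."""
    # Neither alternative allows a quote inside the value, so strip('"')
    # can only remove the two enclosing quotes of the quoted form.
    return [value.strip('"') for value in TERM_RE.findall(text)]

s = ('xyz term:abc 123 foo', 'foo term:"abc 123 "foo')
print(extract_terms('\n'.join(s)))
```

Whitespace inside the quotes is preserved, so the second value keeps its trailing space.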

Upvotes: 1
