Tim Farmer

Reputation: 1

Regular expression group matching

I am trying to match a sequence of binary digits separated by whitespace, like this:

>>> seq = '0 1 1 1 0 0 1 0'

So I created the regex:

>>> pat = r'(\b[01]\b)+'

But the following search returns only one digit:

>>> import re
>>> re.search(pat, seq).group(0)
'0'

What's wrong?

Upvotes: 0

Views: 221

Answers (2)

wim

Reputation: 362707

You're very close; the pattern is just missing a space, so the repeated group has no way to get past the whitespace between digits. Try pat = r'\b([01] )*[01]\b':

>>> import re
>>> seq = '0 1 1 1 0 0 1 0'
>>> pat = r'\b([01] )*[01]\b'
>>> re.search(pat, seq).group(0)
'0 1 1 1 0 0 1 0'
>>> re.search(pat, 'spam and 0 0 0 1 0eggs').group(0)
'0 0 0 1'
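
To see why the space matters, it may help to compare the spans of the two matches. The following is only a sketch using the standard re module; the names original and fixed are illustrative and not from the question, and the non-capturing group (?:...) is swapped in for the capturing one purely for clarity.

```python
import re

seq = '0 1 1 1 0 0 1 0'

# Original pattern: \b[01]\b matches one digit, and nothing inside the
# group can consume the space that follows, so the repetition stops there.
original = re.search(r'(\b[01]\b)+', seq)

# Fixed pattern: each repetition consumes "digit + space", then a final digit.
# (?:...) is a non-capturing group; it matches the same text as (...).
fixed = re.search(r'\b(?:[01] )*[01]\b', seq)

print(original.span())  # (0, 1)
print(fixed.span())     # (0, 15)
```

The original match ends after the first character, while the fixed pattern covers the whole 15-character string.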

Upvotes: 2

Andrew Clark

Reputation: 208475

Your current regex has no way to match the whitespace, so each match can only be a single character. You can either use the same regex with re.findall() to get all matches in the string, or modify the regex so it continues matching when it encounters whitespace.

Here is an example using re.findall():

>>> import re
>>> re.findall(r'(\b[01]\b)+', '0 1 1 1 0 0 1 0')
['0', '1', '1', '1', '0', '0', '1', '0']
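
If the goal is to work with the individual digits, a small sketch like the one below may be useful. It assumes the group and the + are dropped, since re.findall() returns the group's text when a pattern contains one capturing group and the full match when it contains none. The names bits and scattered are hypothetical.

```python
import re

seq = '0 1 1 1 0 0 1 0'

# With no capturing group, findall returns the full text of each match.
digits = re.findall(r'\b[01]\b', seq)
bits = [int(d) for d in digits]
print(bits)  # [0, 1, 1, 1, 0, 0, 1, 0]

# Caveat: findall collects standalone digits from anywhere in the string,
# so the results need not come from one contiguous sequence.
scattered = re.findall(r'\b[01]\b', '1 spam 0')
print(scattered)  # ['1', '0']
```

This approach fits when any standalone 0 or 1 should count; it does not check that the digits are adjacent to one another.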

Alternatively, changing the regex to (\b[01]\b\s?)+ gets the entire sequence in a single match:

>>> re.search(r'(\b[01]\b\s?)+', '0 1 1 1 0 0 1 0').group(0)
'0 1 1 1 0 0 1 0'
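
One thing to watch for with the optional \s: if the sequence is followed by more text, the whitespace after the last digit becomes part of the match. The sketch below illustrates this with a made-up input string and strips the result with str.rstrip(); whether stripping is appropriate depends on the use case.

```python
import re

pat = r'(\b[01]\b\s?)+'

# The optional \s after the last digit is consumed when it is present.
m = re.search(pat, 'bits: 0 1 1 0 end')
print(repr(m.group(0)))           # '0 1 1 0 '
print(repr(m.group(0).rstrip()))  # '0 1 1 0'
```

Note also that \s matches tabs and newlines as well as spaces, so this pattern accepts any single whitespace character between digits.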

Upvotes: 0
