Reputation: 379
I am trying to use Python's regex to recognize 3 tokens from the user, each of which can be composed of letters or numbers.
Here is the code I was using:
match = re.search(r'(\w+)(\w+)(\w+)', inputStr)
if not match:
    print("Not valid")
else:
    numWord1 = match.group(0)
    numWord2 = match.group(1)
    numWord3 = match.group(2)
    print(numWord1)
where inputStr comes from raw_input(). I ran my code, and here's what I get for each input:
I typed: 1 2 3
I got: Not valid
I typed: 11 22 33
I got: Not valid
I typed: 111 222 333
I got: 111
I typed: 1 hello 3
I got: hello
I typed: hello 2 3
I got: hello
I thought \w matched any letter, digit, or underscore, and by including the + I would get 1 or more instances of it in my group.
Upvotes: 1
Views: 7565
Reputation: 133544
"I thought \w matched any letter, digit, or underscore"
Yes, but \w does not match spaces, e.g. the ones in 1 2 3.
Your pattern r'(\w+)(\w+)(\w+)' is looking for any letter, digit, or underscore, three times or more in a row, with nothing in between.
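To see this concretely, here is a small sketch that runs the pattern from the question against the five inputs listed there. The names pattern, inputs, and results are illustrative and not from the original post.

```python
import re

# The pattern from the question: three groups with nothing between them,
# so a match needs at least three consecutive word characters.
pattern = re.compile(r'(\w+)(\w+)(\w+)')

inputs = ["1 2 3", "11 22 33", "111 222 333", "1 hello 3", "hello 2 3"]

# Map each input to the full matched text (group 0), or None if nothing matched.
results = {}
for text in inputs:
    match = pattern.search(text)
    results[text] = match.group(0) if match else None
    print(repr(text), "->", results[text])
```

Only inputs containing a run of three or more word characters match at all, which lines up with the "Not valid" results in the question. Within such a run, the three groups split it between themselves by backtracking rather than capturing separate tokens.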
Upvotes: 2
Reputation: 17041
\w does not match the spaces between the numbers. As you correctly pointed out, \w matches a letter, digit, or underscore, but not a space. Try:
match = re.search(r'(\w+)\s+(\w+)\s+(\w+)', inputStr)
\s+ matches one or more whitespace characters between the tokens.
Example as tested in Python 3:
>>> print(re.search(r'(\w+)\s+(\w+)\s+(\w+)',input('? ')).group(1))
? 1 2 3
1
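Putting this together, a minimal sketch of the corrected snippet might look like the following. The function name parse_three_tokens is hypothetical, chosen only for illustration. Note that group(0) is the entire match, so the three captured tokens are groups 1 to 3.

```python
import re

def parse_three_tokens(input_str):
    """Return the three whitespace-separated tokens, or None if not found."""
    match = re.search(r'(\w+)\s+(\w+)\s+(\w+)', input_str)
    if not match:
        return None
    # group(0) is the entire match; the captured tokens are groups 1 to 3.
    return match.group(1), match.group(2), match.group(3)

print(parse_three_tokens("1 2 3"))      # ('1', '2', '3')
print(parse_three_tokens("hello 2 3"))  # ('hello', '2', '3')
print(parse_three_tokens("onlyone"))    # None
```

Because re.search scans for the pattern anywhere in the string, extra text before or after the three tokens is ignored in this sketch.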
Upvotes: 5