Reputation: 1580
I want to match only alphabetic characters, i.e. a-z or A-Z, which can also contain spaces. The intent is to match any multiword name like 'Vivek Jha'. I expect the following regex to work:
re.match(r'^[aA-zZ\s]+$', name)
It works for all the cases, but it also matches the word 'Vivek_Jha'.
I do not want an underscore to be matched. How is this _ getting matched?
I have worked with regex in Perl and Tcl, but I think Python is doing something more than I can imagine.
Upvotes: 2
Views: 9260
Reputation: 107287
If you want to match only alphabetic characters, which can also contain spaces, just use:
r'^[a-zA-Z ]+$'
Note that aA-zZ is the wrong way to match letters: inside a character class it is read as the literal a, the range A-z, and the literal Z. Use a-z for lowercase and A-Z for uppercase.
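As a quick check of the difference, here is a minimal sketch comparing the pattern from the question with the corrected one. The helper name is_name is only an illustration and does not come from the thread.

```python
import re

# Pattern from the question: read as 'a', the range 'A'-'z', 'Z', and \s.
broken = re.compile(r'^[aA-zZ\s]+$')
# Corrected pattern: two separate letter ranges plus a literal space.
fixed = re.compile(r'^[a-zA-Z ]+$')

def is_name(pattern, text):
    """Return True if the whole text is accepted by the pattern."""
    return pattern.match(text) is not None

print(is_name(broken, 'Vivek_Jha'))  # True: '_' falls inside A-z
print(is_name(fixed, 'Vivek_Jha'))   # False
print(is_name(fixed, 'Vivek Jha'))   # True
```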
Note:
The \s metacharacter matches any whitespace character, not just a space.
A whitespace character can be:
A space character
A tab character
A carriage return character
A new line character
A vertical tab character
A form feed character
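To see why the choice between \s and a literal space matters for names, the following sketch runs both variants against a few sample strings. The sample strings are made up for illustration.

```python
import re

with_s = re.compile(r'^[a-zA-Z\s]+$')      # accepts any whitespace
with_space = re.compile(r'^[a-zA-Z ]+$')   # accepts only the space character

samples = ['Vivek Jha', 'Vivek\tJha', 'Vivek\nJha']
for s in samples:
    # repr() makes the tab and newline visible in the output
    print(repr(s), bool(with_s.match(s)), bool(with_space.match(s)))
```

If tabs or line breaks should not count as part of a name, the literal-space version is probably the safer choice.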
Upvotes: 4
Reputation: 2022
Try a-zA-Z instead of aA-zZ.
The range a-z has nothing in it but letters, and the same goes for A-Z, but A-z also spans the punctuation characters that sit between Z and a in ASCII, including the underscore.
Upvotes: 2
Reputation:
A-z captures everything from ASCII character A to ASCII character z. This includes the _ character as well as several others. For more information on this, you can view Wikipedia's ASCII article.
To fix the problem, you need to do:
re.match(r'[a-zA-Z\s]+$', name)
This tells Python to match only characters in the ASCII ranges a-z and A-Z, plus whitespace.
Also, I removed the ^ because re.match matches from the start of the string by default.
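A short sketch of that behaviour, assuming the same character class as above: with re.match the leading ^ makes no difference, whereas re.search is not anchored and would need it. The function name agree is hypothetical.

```python
import re

with_caret = re.compile(r'^[a-zA-Z\s]+$')
without_caret = re.compile(r'[a-zA-Z\s]+$')

def agree(name):
    """True if both patterns give the same result under re.match semantics."""
    return bool(with_caret.match(name)) == bool(without_caret.match(name))

for name in ['Vivek Jha', '_Vivek Jha', 'Vivek_Jha']:
    print(repr(name), agree(name))

# search() scans forward, so without ^ it can skip the leading underscore.
print(bool(without_caret.search('_Vivek Jha')))  # True
print(bool(with_caret.search('_Vivek Jha')))     # False
```

The trailing $ is still needed in both cases, since re.match only anchors the start of the string.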
Upvotes: 6