Reputation: 1058
I am reading a book and see tons of examples like this:
(?P<email>
[\w\d.+-]+ # username
@
([\w\d.]+\.)+ # domain name prefix
(com|org|edu) # limit the allowed top-level domains
)
Since \w means [a-zA-Z0-9_] and \d means [0-9], \d is a subset of \w.
So, aren't those "\d"s redundant? Could someone please confirm that my understanding is correct? This is driving me nuts.
Upvotes: 6
Views: 495
Reputation: 1921
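Yes, the \d is redundant inside those character classes: [\w\d.+-] matches exactly the same characters as [\w.+-], so plain \w would work just as well. The Python documentation (https://docs.python.org/2/library/re.html) defines the two classes as follows: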
\d
When the UNICODE flag is not specified, matches any decimal digit; this is equivalent to the set [0-9]. With UNICODE, it will match whatever is classified as a decimal digit in the Unicode character properties database.
\w
When the LOCALE and UNICODE flags are not specified, matches any alphanumeric character and the underscore; this is equivalent to the set [a-zA-Z0-9_]. With LOCALE, it will match the set [0-9_] plus whatever characters are defined as alphanumeric for the current locale. If UNICODE is set, this will match the characters [0-9_] plus whatever is classified as alphanumeric in the Unicode character properties database.
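If you would rather check this than take it on trust, here is a minimal sketch. It assumes Python 3, whereas the documentation quoted above is for Python 2; in Python 3, str patterns are Unicode-aware by default and re.ASCII restricts \w and \d to the ASCII sets. The helper names (digits_not_covered_by_word, extract) and the sample strings are made up for illustration. The first part looks for any character that \d matches but \w does not; the second compares the book's pattern with a version that drops the \d.

```python
import re
import sys


def digits_not_covered_by_word(flags=0):
    # Collect every code point that \d matches but \w does not.
    # An empty result means \d adds nothing to a class that already has \w.
    digit = re.compile(r"\d", flags)
    word = re.compile(r"\w", flags)
    return [
        chr(cp)
        for cp in range(sys.maxunicode + 1)
        if digit.match(chr(cp)) and not word.match(chr(cp))
    ]


# The pattern from the question, written in verbose mode.
ORIGINAL = re.compile(r"""
    (?P<email>
    [\w\d.+-]+        # username
    @
    ([\w\d.]+\.)+     # domain name prefix
    (com|org|edu)     # limit the allowed top-level domains
    )
""", re.VERBOSE)

# The same pattern with the redundant \d removed.
SIMPLIFIED = re.compile(r"""
    (?P<email>
    [\w.+-]+          # username
    @
    ([\w.]+\.)+       # domain name prefix
    (com|org|edu)     # limit the allowed top-level domains
    )
""", re.VERBOSE)


def extract(pattern, text):
    # Return the text captured by the "email" group, or None if no match.
    match = pattern.search(text)
    return match.group("email") if match else None


# Hypothetical sample inputs, chosen only for illustration.
SAMPLES = [
    "alice.smith+tag@mail.example.com",
    "contact bob_42@example.org today",
    "not an email",
]

if __name__ == "__main__":
    print("uncovered (default):", digits_not_covered_by_word())
    print("uncovered (ASCII):  ", digits_not_covered_by_word(re.ASCII))
    for sample in SAMPLES:
        print(repr(sample), extract(ORIGINAL, sample), extract(SIMPLIFIED, sample))
```

Both patterns should give the same result on every input, since the two character classes match the same set of characters. The full scan over all code points takes a second or two, so it is more of a one-off sanity check than something to keep in a test suite.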
Upvotes: 6