hxin

Reputation: 1058

Isn't \d redundant in [\w\d]?

I am reading a book and see tons of examples like this:

(?P<email>
[\w\d.+-]+ # username
@
([\w\d.]+\.)+ # domain name prefix
(com|org|edu) # limit the allowed top-level domains
)

Since \w means [a-zA-Z0-9_] and \d means [0-9], \d is a subset of \w.
So aren't those "\d"s redundant? Could someone please confirm that my understanding is correct? This is driving me nuts.

Upvotes: 6

Views: 495

Answers (1)

Russ Cox

Reputation: 1921

Yes, the \d is redundant inside [\w\d]; plain \w matches exactly the same characters. From the Python 2 re documentation (https://docs.python.org/2/library/re.html):

\d

When the UNICODE flag is not specified, matches any decimal digit; this is equivalent to the set [0-9]. With UNICODE, it will match whatever is classified as a decimal digit in the Unicode character properties database.

\w

When the LOCALE and UNICODE flags are not specified, matches any alphanumeric character and the underscore; this is equivalent to the set [a-zA-Z0-9_]. With LOCALE, it will match the set [0-9_] plus whatever characters are defined as alphanumeric for the current locale. If UNICODE is set, this will match the characters [0-9_] plus whatever is classified as alphanumeric in the Unicode character properties database.
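
If you want to convince yourself empirically, here is a minimal sketch that compares the two character classes one code point at a time. It assumes Python 3 (where str patterns are Unicode-aware by default) rather than the Python 2 documentation quoted above, and the helper name differing_chars and the sample addresses are made up for illustration.

```python
import re


def differing_chars(flags=0, limit=0x10000):
    """Return the characters below `limit` on which [\\w\\d] and [\\w] disagree."""
    with_d = re.compile(r"[\w\d]", flags)
    without_d = re.compile(r"[\w]", flags)
    return [
        chr(cp)
        for cp in range(limit)
        if bool(with_d.fullmatch(chr(cp))) != bool(without_d.fullmatch(chr(cp)))
    ]


# The pattern from the question, and the same pattern with every \d dropped.
EMAIL_ORIGINAL = re.compile(r"""(?P<email>
    [\w\d.+-]+          # username
    @
    ([\w\d.]+\.)+       # domain name prefix
    (com|org|edu)       # limit the allowed top-level domains
)""", re.VERBOSE)

EMAIL_SIMPLIFIED = re.compile(r"""(?P<email>
    [\w.+-]+            # username
    @
    ([\w.]+\.)+         # domain name prefix
    (com|org|edu)       # limit the allowed top-level domains
)""", re.VERBOSE)

# Hypothetical sample inputs, chosen only to exercise both outcomes.
SAMPLES = ["user.name+tag42@mail.example.com", "a1@b2.org", "nobody@example.net"]

if __name__ == "__main__":
    print(differing_chars())          # default (Unicode) matching
    print(differing_chars(re.ASCII))  # ASCII-only matching
    for s in SAMPLES:
        same = bool(EMAIL_ORIGINAL.fullmatch(s)) == bool(EMAIL_SIMPLIFIED.fullmatch(s))
        print(s, same)
```

Both lists should come back empty if \d really adds nothing to the class. The check only covers code points below `limit`, and the sample comparison is a spot check, not a proof.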

Upvotes: 6
