jjwdesign

Reputation: 3322

Does regular expression \d match minus sign and/or decimal point?

I'm looking at some old Perl/CGI code to debug an issue and noticed a lot of uses of:

\d - Match digit character
\D - Match non-digit character

Most online docs mention that \d is the same as [0-9], which is how I've always thought of it. But I've also noticed Stack Overflow questions that mention character set differences.

Does "\d" in regex mean a digit?

Does \d also match a minus sign and/or decimal point?

I'm off to do some testing.

Upvotes: 2

Views: 4120

Answers (3)

nhahtdh

Reputation: 56819

I don't know how Perl determines whether to use Unicode, ASCII, or locale rules by default (no flag, no use). Regardless, by declaring use re '/a'; (ASCII), use re '/u'; (Unicode), or use re '/l'; (locale), you clearly signal to the Perl interpreter (and to human readers) which mode you want, and you avoid unexpected behaviour.
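As a rough illustration of the pragma form, the sketch below sets the default flag inside two lexical scopes and tests the same pattern in each. The helper names and the sample character (THAI DIGIT SEVEN, U+0E57) are made up for this example, and it assumes Perl 5.14 or later, where these flags are available.

```perl
use 5.014;
use strict;
use warnings;

# Hypothetical helpers: the same pattern, with a different default flag
# in each lexical scope.
sub is_digit_ascii_default {
    use re '/a';    # \d means [0-9] for patterns compiled in this scope
    return $_[0] =~ /\A\d\z/ ? 1 : 0;
}

sub is_digit_unicode_default {
    use re '/u';    # \d means any Unicode decimal digit in this scope
    return $_[0] =~ /\A\d\z/ ? 1 : 0;
}

my $thai_seven = "\x{0E57}";    # THAI DIGIT SEVEN

printf "ASCII default:   '7' => %d, U+0E57 => %d\n",
    is_digit_ascii_default('7'), is_digit_ascii_default($thai_seven);
printf "Unicode default: '7' => %d, U+0E57 => %d\n",
    is_digit_unicode_default('7'), is_digit_unicode_default($thai_seven);
```

Because the pragma is lexically scoped, it only affects patterns compiled inside the enclosing block or file, so it can be introduced into old code one scope at a time.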

Due to the effect of modifiers, \d has at least 2 meanings:

  • Under the /a flag (ASCII), \d matches the digits 0 through 9 (no more and no less).
  • Under the /u flag (Unicode), \d matches any decimal digit from any script and is equivalent to \p{Digit}. This effectively makes \d+ pretty useless and dangerous to use, since it allows a mix of digits from different scripts.

    Quote from the description of the /u flag:

    And, \d+, may match strings of digits that are a mixture from different writing systems, creating a security issue. num() in Unicode::UCD can be used to sort this out. Or the /a modifier can be used to force \d to match just the ASCII 0 through 9.

\d will not match any sign or punctuation, since those characters do not belong to the Nd (Number, decimal digit) General Category of Unicode.
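A small check of the points above, as a sketch rather than a definitive test: the helper names and sample strings are invented for illustration, and Perl 5.14 or later is assumed for the /a and /u modifiers. The "mixed" sample joins an ASCII 1 with ARABIC-INDIC DIGIT THREE (U+0663).

```perl
use 5.014;
use strict;
use warnings;

# Hypothetical helpers: does the whole string consist of \d characters?
sub all_digits_ascii   { return $_[0] =~ /\A\d+\z/a ? 1 : 0 }
sub all_digits_unicode { return $_[0] =~ /\A\d+\z/u ? 1 : 0 }

my @samples = (
    [ plain    => '123' ],
    [ negative => '-123' ],         # minus sign is not a digit
    [ decimal  => '1.5' ],          # decimal point is not a digit
    [ mixed    => "1\x{0663}" ],    # ASCII 1 + ARABIC-INDIC DIGIT THREE
);

for my $sample (@samples) {
    my ($label, $string) = @$sample;
    printf "%-9s /a: %d  /u: %d\n",
        $label, all_digits_ascii($string), all_digits_unicode($string);
}
```

Only the "mixed" row should differ between the two columns: /u accepts it and /a rejects it, while the strings containing a sign or a decimal point are rejected under both.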

Upvotes: 8

David W.

Reputation: 107080

The answer is no. It merely does a digit check. However, Unicode makes things a bit more complex.

If you want to make sure something is a number (a decimal number), take a look at the Scalar::Util module. One of the functions it provides is looks_like_number. It tells you whether the string you're looking at could be a number, and it works better than trying to use a regular expression.

This module has been part of standard Perl for a while, so you should have it on your system.
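A minimal sketch of that approach, with sample strings chosen only for illustration:

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical inputs: signed, fractional and exponent forms, plus a few
# strings that are not numbers.
for my $string ('42', '-3.14', '1e5', '12abc', '-', '.') {
    printf "%-6s %s\n", $string,
        looks_like_number($string) ? 'looks like a number' : 'not a number';
}
```

Note that looks_like_number follows Perl's own idea of what a number is, so it also accepts forms such as exponent notation. If only plain decimals are acceptable, a stricter check may still be needed.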

Upvotes: 3

Kent

Reputation: 195209

Does \d also match a minus sign and/or decimal point?

NO

Upvotes: 12
