Reputation: 63
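I need a regex to match a string that:
- contains only digits (0-9) and spaces
- has at least two digits
- has all digits the same
- starts and ends with a digit
Matches: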
11 11111 1 1 1 1 1 1 1 11 1 1 1 1 1 1 1 1 1 1
No matches:
"1" - has only one digit
"11111 " - has a space at the end
" 11111" - has a space at the beginning
"12" - digits are different
"11:" - has another character
I know a regex for each of my requirements, but that way I'd need 4 separate regex tests. Can it be done in one regex?
Upvotes: 6
Views: 188
Reputation: 455122
Yes, it can be done in one regex:
^(\d)(?:\1| )*\1$
Explanation:
^ - Start anchor
( - Start parenthesis for capturing
\d - A digit
) - End parenthesis for capturing
(?: - Start parenthesis for grouping only
\1 - Back reference referring to the digit capture before
| - Or
' ' - A literal space
) - End grouping parenthesis
* - Zero or more of the preceding group
\1 - The digit captured before
$ - End anchor
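As a quick check, the pattern can be run against the inputs from the question. This is only a minimal sketch: the helper name is_same_digit_string is made up for illustration, and '1 1 1' is an assumed sample in the spirit of the question's matching examples.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The pattern from this answer: one captured digit, then any mix of that
# same digit and spaces, ending on that same digit.
my $same_digit = qr/^(\d)(?:\1| )*\1$/;

# Hypothetical helper (not part of the original answer):
# returns 1 when the string matches, 0 otherwise.
sub is_same_digit_string {
    my ($s) = @_;
    return $s =~ $same_digit ? 1 : 0;
}

# Inputs are quoted in the output so leading/trailing spaces stay visible.
for my $s ('11', '11111', '1 1 1', '1', '11111 ', ' 11111', '12', '11:') {
    printf "%-10s %s\n", "'$s'", is_same_digit_string($s) ? 'match' : 'no match';
}
```

One caveat: in Perl, $ also matches just before a trailing newline, so "11\n" would pass. If that matters, \z in place of $ is stricter.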
Upvotes: 14
Reputation: 80405
Consider this program:
#!/usr/bin/perl -l
$_ = "3 33 3 3";
print /^(\d)[\1 ]*\1$/ ? 1 : 0;
print /^(\d)(?:\1| )*\1$/ ? 1 : 0;
It produces the output
0
1
The reason becomes clear when you look at the compiled regexes:
perl -c -Mre=debug /tmp/a
Compiling REx "^(\d)[\1 ]*\1$"
synthetic stclass "ANYOF[0-9][{unicode_all}]".
Final program:
1: BOL (2)
2: OPEN1 (4)
4: DIGIT (5)
5: CLOSE1 (7)
7: STAR (19)
8: ANYOF[\1 ][] (0)
19: REF1 (21)
21: EOL (22)
22: END (0)
floating ""$ at 1..2147483647 (checking floating) stclass ANYOF[0-9][{unicode_all}] anchored(BOL) minlen 1
Compiling REx "^(\d)(?:\1| )*\1$"
synthetic stclass "ANYOF[0-9][{unicode_all}]".
Final program:
1: BOL (2)
2: OPEN1 (4)
4: DIGIT (5)
5: CLOSE1 (7)
7: CURLYX[1] {0,32767} (17)
9: BRANCH (12)
10: REF1 (16)
12: BRANCH (FAIL)
13: EXACT < > (16)
15: TAIL (16)
16: WHILEM[1/1] (0)
17: NOTHING (18)
18: REF1 (20)
20: EOL (21)
21: END (0)
floating ""$ at 1..2147483647 (checking floating) stclass ANYOF[0-9][{unicode_all}] anchored(BOL) minlen 1
/tmp/a syntax OK
Freeing REx: "^(\d)[\1 ]*\1$"
Freeing REx: "^(\d)(?:\1| )*\1$"
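Inside a character class, \1 is not a backreference at all: it is just an ordinary octal escape, here the character with code 1. That is why the first pattern compiles to ANYOF[\1 ] rather than to a REF1 node.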
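To make that concrete, here is a small sketch that feeds both patterns a string containing an actual chr(1). The sub names class_version and group_version are made up for illustration; the two patterns are the ones from the program above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Inside [...] the sequence \1 is the octal escape for chr(1),
# so this class accepts only chr(1) or a space between the digits.
sub class_version {
    my ($s) = @_;
    return $s =~ /^(\d)[\1 ]*\1$/ ? 1 : 0;
}

# Outside a class, \1 is a real backreference to the captured digit.
sub group_version {
    my ($s) = @_;
    return $s =~ /^(\d)(?:\1| )*\1$/ ? 1 : 0;
}

my $plain     = "3 33 3 3";            # the string from the program above
my $with_ctrl = "3" . chr(1) . "3";    # digit, control character 0x01, digit

print class_version($plain),     "\n";   # 0: the class cannot consume the inner 3s
print group_version($plain),     "\n";   # 1
print class_version($with_ctrl), "\n";   # 1: chr(1) is exactly what [\1 ] accepts
print group_version($with_ctrl), "\n";   # 0: chr(1) is neither the digit nor a space
```

So the character-class version is not merely too strict: it also accepts input containing a control character, which is probably not what anyone intends.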
Upvotes: 2