Reputation: 66156
I need to validate input: valid values are either a number or an empty string. What is the corresponding regular expression?
String pattern = "\\d+|<what should be here?>";
UPD: don't suggest "\d*" please, I'm just curious how to express "empty string" in a regexp.
Upvotes: 10
Views: 18635
Reputation: 383726
In this particular case, ^\d*$ would work, but generally speaking, to match pattern or an empty string, you can use:
^$|pattern
^ and $ are the beginning and end of the string anchors respectively. | is used to denote alternates, e.g. this|that.
In the so-called multiline mode (Pattern.MULTILINE/(?m) in Java), ^ and $ match the beginning and end of each line instead. The anchors for the beginning and end of the string are then \A and \Z respectively (in Java, \Z also matches just before a final line terminator; \z is the absolute end of input).
If you're in multiline mode, then the empty string is matched by \A\Z instead. ^$ would match an empty line within the string.
Here are some examples to illustrate the above points:
String numbers = "012345";
System.out.println(numbers.replaceAll(".", "<$0>"));
// <0><1><2><3><4><5>
System.out.println(numbers.replaceAll("^.", "<$0>"));
// <0>12345
System.out.println(numbers.replaceAll(".$", "<$0>"));
// 01234<5>
numbers = "012\n345\n678";
System.out.println(numbers.replaceAll("^.", "<$0>"));
// <0>12
// 345
// 678
System.out.println(numbers.replaceAll("(?m)^.", "<$0>"));
// <0>12
// <3>45
// <6>78
System.out.println(numbers.replaceAll("(?m).\\Z", "<$0>"));
// 012
// 345
// 67<8>
matches
In Java, matches attempts to match a pattern against the entire string. This is true for String.matches, Pattern.matches and Matcher.matches.
This means that sometimes, anchors can be omitted for Java matches when they're otherwise necessary for other flavors and/or other Java regex methods.
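As a rough illustration of the difference between matches and find, here is a minimal sketch. The class and helper names (MatchesDemo, isNumberOrEmpty, containsNumber) are made up for this example and are not part of any library:

```java
import java.util.regex.Pattern;

class MatchesDemo {
    // Hypothetical helper: accepts a number or the empty string.
    static boolean isNumberOrEmpty(String input) {
        // String.matches must consume the whole input, so "^$" covers the
        // empty case and "\\d+" covers the number case.
        return input.matches("^$|\\d+");
    }

    // Hypothetical helper: find() looks for a match anywhere in the input,
    // so without anchors it only reports that some digits are present.
    static boolean containsNumber(String input) {
        return Pattern.compile("\\d+").matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(isNumberOrEmpty(""));       // true
        System.out.println(isNumberOrEmpty("2010"));   // true
        System.out.println(isNumberOrEmpty("20x10"));  // false
        System.out.println("20x10".matches("\\d+"));   // false
        System.out.println(containsNumber("20x10"));   // true
    }
}
```

The last two lines use the same pattern on the same input; only the method differs, which is why anchors matter for find but not for matches.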
Upvotes: 18
Reputation: 626738
To make any pattern that matches an entire string optional, i.e. to allow the pattern to also match an empty string, use an optional group:
^(pattern)?$
^^       ^^^
If the regex engine allows (as in Java), prefer a non-capturing group since its main purpose is to only group subpatterns, not keep the subvalues captured:
^(?:pattern)?$
The ^ will match the start of the string (or \A can be used in many flavors for this), $ will match the end of the string (or \z can be used to match the very end in many flavors, Java included), and the (....)? will match 1 or 0 (due to the ? quantifier) occurrences of the subpattern inside the parentheses.
A Java usage note: when used in matches(), the initial ^ and trailing $ can be omitted and you can use
String pattern = "(?:\\d+)?";
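A small sketch of the optional-group idea, assuming a made-up helper orEmpty that wraps an arbitrary pattern. It uses find() so that the explicit ^ and $ anchors do the work, and the cat|dog case shows why the group is needed once the pattern itself contains an alternation:

```java
import java.util.regex.Pattern;

class OptionalGroupDemo {
    // Hypothetical helper: wraps a pattern so that it also accepts "".
    static Pattern orEmpty(String pattern) {
        // The non-capturing group keeps any alternation inside the pattern
        // from escaping the anchors.
        return Pattern.compile("^(?:" + pattern + ")?$");
    }

    // Hypothetical helper: true if the wrapped pattern is found in input.
    static boolean accepts(String pattern, String input) {
        return orEmpty(pattern).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(accepts("\\d+", ""));           // true
        System.out.println(accepts("\\d+", "123"));        // true
        System.out.println(accepts("\\d+", "12a"));        // false
        System.out.println(accepts("cat|dog", "dog"));     // true
        System.out.println(accepts("cat|dog", "catdog"));  // false
    }
}
```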
Upvotes: 0
Reputation: 5665
One way to view the set of regular languages is as the closure of the things below:
A concrete regular language is a concrete element of this closure.
I didn't find an empty-string symbol in the POSIX standard to express the regular-language idea from step (1).
But there is an extra construct there, the question mark, which by the POSIX definition is equivalent to the following:
(regexp|< EMPTY_STRING >)
So you can do it in the following manner in bash, Perl, and Python:
echo 9023 | grep -E "(1|90)?23"
perl -e "print 'PASS' if (qq(23) =~ /(1|90)?23/)"
python -c "import re; print(bool(re.match('^(1|90)?23$', '23')))"
Upvotes: 0
Reputation: 30228
Just as a funny solution, you can do:
\d+|\d{0}
A digit, zero times. Yes, it does work.
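For anyone who wants to check this in Java, a minimal sketch (the class and method names are invented for the example):

```java
class ZeroDigitsDemo {
    // Hypothetical helper: "\\d{0}" matches exactly zero digits,
    // i.e. the empty string, so it works as the second alternative.
    static boolean isNumberOrEmpty(String input) {
        return input.matches("\\d+|\\d{0}");
    }

    public static void main(String[] args) {
        System.out.println(isNumberOrEmpty(""));     // true
        System.out.println(isNumberOrEmpty("007"));  // true
        System.out.println(isNumberOrEmpty("7up"));  // false
    }
}
```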
Upvotes: 1
Reputation: 336108
To explicitly match the empty string, use \A\Z.
You can also often see ^$ which works fine unless the option is set to allow the ^ and $ anchors to match not only at the start or end of the string but also at the start/end of each line. If your input can never contain newlines, then of course ^$ is perfectly OK.
Some regex flavors don't support \A and \Z anchors (especially JavaScript).
If you want to allow "empty" as in "nothing or only whitespace", then go for \A\s*\Z or ^\s*$.
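To make the difference concrete in Java, here is a minimal sketch. The helper found is hypothetical and simply runs find() with the given regex; the sample text contains an empty line but is not itself empty:

```java
import java.util.regex.Pattern;

class EmptyAnchorsDemo {
    // Hypothetical helper: true if regex matches anywhere in input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String text = "a\n\nb";  // has an empty line, but is not empty

        System.out.println(found("^$", text));             // false
        System.out.println(found("(?m)^$", text));         // true: the empty line
        System.out.println(found("(?m)\\A\\Z", text));     // false
        System.out.println(found("(?m)\\A\\Z", ""));       // true

        // "Nothing or only whitespace"
        System.out.println(found("\\A\\s*\\Z", " \t "));   // true
        System.out.println(found("\\A\\s*\\Z", " x "));    // false
    }
}
```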
Upvotes: 3
Reputation: 10946
/^\d*$/
Matches 0 or more digits with nothing before or after.
Explanation:
The '^' means start of line. '$' means end of line. '*' matches 0 or more occurrences. So the pattern matches an entire line with 0 or more digits.
Upvotes: 6