Reputation: 23
Regex: /^(\d+)[^_]/gm
Test String: 12_34
I'd expect this regex not to match the test string, because \d+ is greedy and consumes the digits 1 and 2, and [^_] then fails on _.
But it unexpectedly matches, with only 1 in Group 1. Where am I wrong?
I'm trying to find a regular expression that matches the digits in the test strings "12" or "12xx" but does not match "12_xx".
Sample:
https://regex101.com/r/0QRTjs/1/
Dialect: In the end I'll use Microsoft System.Text.RegularExpressions.
Upvotes: 1
Views: 347
Reputation: 103884
\d+ can give back characters it has already matched if that lets the overall pattern match. It first takes 12, but [^_] then fails on _, so the engine backtracks: \d+ keeps only 1, then 2 satisfies [^_], and 1 is captured.
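To watch the backtracking happen, here is a small sketch in Python (chosen only for illustration; the helper name first_match is made up, and re.MULTILINE stands in for the /m flag). It runs the regex from the question and prints both the whole match and Group 1:

```python
import re

# The regex from the question; re.MULTILINE mirrors the /m flag.
PATTERN = re.compile(r"^(\d+)[^_]", re.MULTILINE)

def first_match(text):
    """Return (whole match, group 1), or None if nothing matches."""
    m = PATTERN.search(text)
    return (m.group(0), m.group(1)) if m else None

print(first_match("12_34"))  # ('12', '1')   \d+ gave back the 2
print(first_match("12xx"))   # ('12x', '12') no backtracking needed
print(first_match("12"))     # ('12', '1')   same backtracking at end of input
```

Note that even the plain "12" only yields 1 in Group 1, because [^_] must consume a character.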
You can use a negative lookahead at the start of the match:
/^(?!\d+_)(\d+)/
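A quick check of the lookahead version against the three strings from the question, again sketched in Python (the helper name leading_digits is hypothetical; the pattern itself should behave the same in .NET, which also supports negative lookahead):

```python
import re

# Reject the line up front if the leading digits are followed by "_".
LOOKAHEAD = re.compile(r"^(?!\d+_)(\d+)")

def leading_digits(text):
    """Return the leading digits, or None if they are followed by '_'."""
    m = LOOKAHEAD.match(text)
    return m.group(1) if m else None

for s in ("12", "12xx", "12_xx"):
    print(repr(s), "->", leading_digits(s))
# '12' -> 12
# '12xx' -> 12
# '12_xx' -> None
```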
Or you can use an atomic group that disallows backtracking:
/^((?>\d+))(?:[^_]|$)/
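The atomic-group version can be tried the same way. This is only a sketch: Python's standard re module accepts (?>...) from version 3.11 on, so the example guards on the interpreter version, and the helper name leading_digits_atomic is made up. .NET supports atomic groups as well.

```python
import re
import sys

# Atomic groups need Python 3.11 or newer in the standard re module.
SUPPORTS_ATOMIC = sys.version_info >= (3, 11)

def leading_digits_atomic(text):
    """Return the leading digits unless they are followed by '_'."""
    if not SUPPORTS_ATOMIC:
        raise RuntimeError("atomic groups need Python 3.11+")
    # (?>\d+) takes all the digits and never gives any of them back.
    m = re.match(r"^((?>\d+))(?:[^_]|$)", text)
    return m.group(1) if m else None

if SUPPORTS_ATOMIC:
    for s in ("12", "12xx", "12_xx"):
        print(repr(s), "->", leading_digits_atomic(s))
```

Because the group is atomic, "12_xx" fails outright instead of falling back to 1.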
Or use the possessive quantifier ++, which also disallows backtracking (the digits are wrapped in the capture group so that Group 1 holds them):
/^(\d++)(?:[^_]|$)/
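The possessive quantifier is likely the fastest, but as far as I know System.Text.RegularExpressions (the dialect named in the question) does not support possessive quantifiers. There, the atomic group is the equivalent.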
Upvotes: 0
Reputation: 12711
\d+ matches one or more digits.
Since you append [^_], the digits must be followed by a character that is not _.
Therefore \d+ cannot match 12, because it is followed by _.
1 ends up in the first matching group because it is followed by 2, which is not _.
If you only want to match lines that consist entirely of digits (so "12" but not "12xx"), there is a very simple expression:
^(\d+)$
Upvotes: 0