Reputation: 457
Friends, I am applying the regex @"^(.*)([0-9/+-^]+)=([0-9]+)$" to the string "3u->4+5=8". Fetching Group[1] returns "3u->4+" and Group[2] returns "5".
I expected:
Group[0]="3u->4+5=8"
Group[1]="3u->"
Group[2]="4+5"
Group[3]="8"
Kindly help.
Upvotes: 1
Views: 111
Reputation: 9644
Your issue is caused by the greedy quantifier .* which will try to "eat up" everything it can.
Use a lazy quantifier instead:
^(.*?)([0-9/+^-]+)=([0-9]+)
This causes .*? to match as little as possible while still finding an overall match: the first group stops just before the 4 in your example.
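To see the difference concretely, here is a minimal sketch that runs both patterns against the string from the question. The question appears to use .NET (judging by the @"..." literal, which is an assumption); the sketch uses Python's re module instead, which should treat these particular constructs the same way. The names GREEDY, LAZY and groups are made up for illustration.

```python
import re

TEXT = "3u->4+5=8"

# Pattern from the question: greedy first group, "+-^" read as a range.
GREEDY = r"^(.*)([0-9/+-^]+)=([0-9]+)$"
# Pattern from this answer: lazy first group, "-" moved to the end of the class.
LAZY = r"^(.*?)([0-9/+^-]+)=([0-9]+)"


def groups(pattern, text):
    """Return the captured groups as a tuple, or None if there is no match."""
    m = re.match(pattern, text)
    return m.groups() if m else None


greedy_groups = groups(GREEDY, TEXT)
lazy_groups = groups(LAZY, TEXT)

print(greedy_groups)  # ('3u->4+', '5', '8')
print(lazy_groups)    # ('3u->', '4+5', '8')
```

The greedy version leaves only the single character the second group needs, while the lazy version hands the second group everything from the 4 onwards.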
Also, don't forget that - is a special character inside a character class. To match it literally, put it at the beginning or the end of the class ([...-]) or escape it with a backslash; otherwise [+-^] becomes a range covering every character from + to ^.
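The effect of the misplaced - can be checked with a small sketch, again in Python's re module as a stand-in for the .NET engine the question seems to use (an assumption). The probe string and the helper accepted are hypothetical, chosen only to show which characters each class lets through.

```python
import re

RANGE_CLASS = r"[+-^]"    # "-" between "+" and "^": a range from "+" to "^"
LITERAL_CLASS = r"[+^-]"  # "-" last: just the three literal characters


def accepted(char_class, chars):
    """Return the characters of `chars` that the class matches."""
    return [c for c in chars if re.fullmatch(char_class, c)]


probe = "+-^>=A5u"
range_hits = accepted(RANGE_CLASS, probe)
literal_hits = accepted(LITERAL_CLASS, probe)

print(range_hits)    # ['+', '-', '^', '>', '=', 'A', '5']
print(literal_hits)  # ['+', '-', '^']

# Making the quantifier lazy without fixing the class is not enough:
# ">" falls inside the range, so the second group starts too early.
half_fixed = re.match(r"^(.*?)([0-9/+-^]+)=([0-9]+)$", "3u->4+5=8").groups()
print(half_fixed)    # ('3u', '->4+5', '8')
```

Because the range also covers > and =, both parts of the fix (the lazy quantifier and the relocated -) are needed to get "3u->" and "4+5".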
What's happening
Our regex (.*)([0-9/+-^]+), like any other regex, wants to return a match. To do that, it needs to find: "anything of any length, followed by at least one character in the [0-9/+-^] class".
Following only this rule, when applied to 3u->4+5 the regex could at first glance match:
- 3u->4+ in the first group, 5 in the second (only one character is required for the second group to match)
- 3u->4 in the first group, +5 in the second
- 3u-> in the first group, 4+5 in the second
So, which one should we match?
In order to know which one to pick, the (heuristic and simplified) rule is:
- if the * quantifier is greedy, it will always try to match the most it can
- if it is lazy (*?), it will match the least it can (while the regex still returns an overall match)
You can read more on the subject here or here, where the general underlying rules and subtleties are explained in more depth.
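The three candidate splits and the greedy/lazy choice between them can be reproduced with a rough sketch. It makes two simplifying assumptions that are not in the original pattern: the character class is the corrected one with - at the end, and the pattern is anchored with ^ and $ so that the second group has to run to the end of the string. Python's re module stands in for the engine from the question, and candidate_splits is a hypothetical helper.

```python
import re

TEXT = "3u->4+5"
CLASS = r"[0-9/+^-]"  # corrected class: "-" is a literal here (assumption)


def candidate_splits(text):
    """List every (first group, second group) split in which the second
    group is non-empty and consists only of characters from CLASS."""
    splits = []
    for i in range(len(text)):
        head, tail = text[:i], text[i:]
        if re.fullmatch(CLASS + "+", tail):
            splits.append((head, tail))
    return splits


candidates = candidate_splits(TEXT)
greedy = re.match(r"^(.*)(" + CLASS + r"+)$", TEXT).groups()
lazy = re.match(r"^(.*?)(" + CLASS + r"+)$", TEXT).groups()

print(candidates)  # [('3u->', '4+5'), ('3u->4', '+5'), ('3u->4+', '5')]
print(greedy)      # ('3u->4+', '5')
print(lazy)        # ('3u->', '4+5')
```

The greedy quantifier settles on the candidate with the longest first group, and the lazy one on the candidate with the shortest, which is the heuristic rule stated above.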
Upvotes: 3