Reputation: 75
Given the following strings (stringToTest):
G2:7JAPjGdnGy8jxR8[RQ:1,2]-G3:jRo6pN8ZW9aglYz[RQ:3,4]
G2:7JAPjGdnGy8jxR8[RQ:3,4]-G3:jRo6pN8ZW9aglYz[RQ:3,4]
And the Pattern:
Pattern p = Pattern.compile("G2:\\S+RQ:3,4");
if (p.matcher(stringToTest).find())
{
// Match
}
For string 1 I DON'T want a match, because RQ:3,4 is associated with the G3 section, not G2. I want string 2 to match, as RQ:3,4 is associated with the G2 section.
The problem with the current regex is that it searches too far and eventually reaches the RQ:3,4 in case 1, even though I don't want to look past the G2 section.
It's also possible that the stringToTest might be (just one section):
G2:7JAPjGdnGy8jxR8[RQ:3,4]
The strings 7JAPjGdnGy8jxR8 and jRo6pN8ZW9aglYz are variable-length hashes.
Can anyone help me with the correct regex to start looking at G2 for RQ:3,4, but stop if it reaches the end of the string or -G (the start of the next section)?
Upvotes: 3
Views: 122
Reputation: 626689
The problem is that \S matches any non-whitespace char, and the regex engine parses the text from left to right. Once it finds G2:, it grabs all non-whitespace chars to the right (since \S+ is a greedy subpattern) and then backtracks to find the rightmost occurrence of RQ:3,4.
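To make the backtracking visible, here is a small sketch that prints what the original pattern actually matches in the first string. The class name GreedyDemo and the helper firstMatch are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // Hypothetical helper: returns the text of the first match, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String s1 = "G2:7JAPjGdnGy8jxR8[RQ:1,2]-G3:jRo6pN8ZW9aglYz[RQ:3,4]";
        // \S+ first runs to the end of the string, then backtracks to the last RQ:3,4,
        // so the reported match crosses the -G3: boundary.
        System.out.println(firstMatch("G2:\\S+RQ:3,4", s1));
    }
}
```

The printed match starts at G2: and ends inside the G3 section, which is why string 1 is accepted by the original regex.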
In a general case, you may use
String regex = "G2:(?:(?!-G)\\S)*RQ:3,4";
See the regex demo. (?:(?!-G)\S)* is a tempered greedy token that will match 0+ occurrences of a non-whitespace char that does not start a -G substring.
If the hyphen is only possible in front of the next section, you may subtract - from \S:
String regex = "G2:[^\\s-]*RQ:3,4"; // using a negated character class
String regex = "G2:[\\S&&[^-]]*RQ:3,4"; // using character class subtraction
See this regex demo. [^\s-]* will match 0 or more chars other than whitespace and -.
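As a quick check, the sketch below runs the tempered greedy token and the negated character class against the three sample strings from the question. The class name SectionMatchDemo and the helper found are illustrative names, not part of any library.

```java
import java.util.regex.Pattern;

class SectionMatchDemo {
    // Hypothetical helper: true if the regex is found anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String tempered = "G2:(?:(?!-G)\\S)*RQ:3,4"; // stops before any -G
        String negated = "G2:[^\\s-]*RQ:3,4";        // stops at any hyphen or whitespace
        String[] samples = {
            "G2:7JAPjGdnGy8jxR8[RQ:1,2]-G3:jRo6pN8ZW9aglYz[RQ:3,4]", // RQ:3,4 only in G3
            "G2:7JAPjGdnGy8jxR8[RQ:3,4]-G3:jRo6pN8ZW9aglYz[RQ:3,4]", // RQ:3,4 in G2
            "G2:7JAPjGdnGy8jxR8[RQ:3,4]"                             // single section
        };
        for (String s : samples) {
            System.out.println(found(tempered, s) + " " + found(negated, s) + " " + s);
        }
    }
}
```

Both patterns should reject the first string and accept the other two, since neither is allowed to scan past the hyphen that opens the G3 section.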
Upvotes: 1
Reputation: 5308
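Try to use [^\[] instead of \S in this regex: G2:[^\[]*\[RQ:3,4
[^\[] means any character but [ (assuming that strings like this: G2:7JAP[jGd]nGy8[]R8[RQ:3,4] are not possible). Note that in Java the [ inside the character class has to be escaped, because an unescaped [ there starts a nested character class.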
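A minimal sketch of this idea as a Java string literal follows. It assumes the [ inside the character class is escaped, since Java treats an unescaped [ there as the start of a nested class. The class name BracketDemo and the helper found are illustrative only.

```java
import java.util.regex.Pattern;

class BracketDemo {
    // Any run of non-[ chars after G2:, then a literal [ followed by RQ:3,4.
    static final String REGEX = "G2:[^\\[]*\\[RQ:3,4";

    // Hypothetical helper: true if REGEX is found anywhere in the input.
    static boolean found(String input) {
        return Pattern.compile(REGEX).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found("G2:7JAPjGdnGy8jxR8[RQ:1,2]-G3:jRo6pN8ZW9aglYz[RQ:3,4]"));
        System.out.println(found("G2:7JAPjGdnGy8jxR8[RQ:3,4]-G3:jRo6pN8ZW9aglYz[RQ:3,4]"));
    }
}
```

Because the scan stops at the first [ after G2:, only the bracket that belongs to the G2 section is ever compared against RQ:3,4.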
Upvotes: 0
Reputation: 784888
You may use this regex with a negative lookahead in between:
G2:(?:(?!G\d+:)\S)*RQ:3,4
RegEx Details:
G2:
: Match literal text G2:
(?:
: Start a non-capture group
(?!G\d+:)
: Assert that we don't have a G<digits>: ahead of us
\S
: Match a non-whitespace character
)*
: End non-capture group. Match 0 or more of this
RQ:3,4
: Match literal text RQ:3,4
In Java use this regex:
String re = "G2:(?:(?!G\\d+:)\\S)*RQ:3,4";
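A short sketch of how this pattern behaves on the strings from the question is shown below. The class name LookaheadDemo and the helper firstMatch are hypothetical, and the sketch assumes the hashes never contain a G<digits>: sequence themselves.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadDemo {
    static final String RE = "G2:(?:(?!G\\d+:)\\S)*RQ:3,4";

    // Hypothetical helper: returns the matched text, or null if RE is not found.
    static String firstMatch(String input) {
        Matcher m = Pattern.compile(RE).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // RQ:3,4 appears only after G3:, so the lookahead blocks the match.
        System.out.println(firstMatch("G2:7JAPjGdnGy8jxR8[RQ:1,2]-G3:jRo6pN8ZW9aglYz[RQ:3,4]"));
        // RQ:3,4 appears inside the G2 section, so the match ends there.
        System.out.println(firstMatch("G2:7JAPjGdnGy8jxR8[RQ:3,4]-G3:jRo6pN8ZW9aglYz[RQ:3,4]"));
    }
}
```

Unlike the -G based variants, this one keys on the next section header (G followed by digits and a colon), so it does not depend on the hyphen separator.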
Upvotes: 2