Reputation: 73
I am looking for a regex for this result
String = This is Cold Water and this is Hot Water, have some Water.
I want to check whether this string contains the word 'Water' that is not preceded by the word 'Cold' or 'Hot'.
String mydata = "This is Cold Water and this is Hot Water, have some Water";
Pattern pattern = Pattern.compile("[^(Cold|Hot)]\\sWater");
Matcher matcher = pattern.matcher(mydata);
if (matcher.matches()) {
    String s = matcher.group(1);
    System.out.println(s);
}
But it results in no match.
Upvotes: 4
Views: 3236
Reputation: 626748
The [^(Cold|Hot)]\sWater pattern matches any char other than (, C, o, ... ), then a single whitespace and then a Water substring. The [^...] is a negated character class; you can't negate sequences of chars with it.
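To see the character-class behavior concretely, here is a minimal sketch that runs the pattern from the question against a few short inputs. The class name, the helper method, and the "Old Water" input are made up for illustration. Note that "Old Water" is rejected only because d is one of the excluded characters, not because any word was checked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question.
final class NegatedClassDemo {

    // Returns every substring of input that the regex matches.
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // [^(Cold|Hot)] excludes the single chars ( C o l d | H t )
        String regex = "[^(Cold|Hot)]\\sWater";
        // Matches one allowed char, the space, and Water.
        System.out.println(findAll(regex, "have some Water"));
        // No match: 'd' is in the excluded set.
        System.out.println(findAll(regex, "Cold Water"));
        // No match either, although "Old" is neither Cold nor Hot.
        System.out.println(findAll(regex, "Old Water"));
    }
}
```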
You may use a regex with a negative lookbehind. The most basic form of it for your case is (?<!Cold\s|Hot\s), and you may further customize it.
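As a rough sketch of that basic form, the lookbehind can be placed directly in front of Water. The class and method names below are hypothetical. The helper collects the start offset of each match so you can see which occurrence of Water survives.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for the basic negative lookbehind.
final class BasicLookbehindDemo {

    // Water that is not immediately preceded by "Cold" or "Hot" plus one whitespace.
    static final Pattern WATER = Pattern.compile("(?<!Cold\\s|Hot\\s)Water");

    // Returns the start index of every match in the input.
    static List<Integer> matchStarts(String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = WATER.matcher(input);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String mydata = "This is Cold Water and this is Hot Water, have some Water";
        // Only the last Water (after "some") is reported.
        System.out.println(matchStarts(mydata));
    }
}
```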
For example, the \s only matches 1 whitespace, and the lookbehind won't work if there are 2 or more whitespaces between Cold and Water or Hot and Water. In Java regex, you may use limiting quantifiers (see Constrained-width Lookbehind), so you may use \s{1,10} to allow the lookbehind to "see" 1 to 10 whitespaces behind.
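The difference shows up as soon as the input has extra spaces. The sketch below (hypothetical class and constant names, made-up test input) compares the single-whitespace lookbehind with the \s{1,10} variant on "Cold   Water", which has three spaces.

```java
import java.util.regex.Pattern;

// Hypothetical demo class comparing the two lookbehind widths.
final class LookbehindWidthDemo {

    // Lookbehind that only sees exactly one whitespace.
    static final Pattern SINGLE_SPACE = Pattern.compile("(?<!Cold\\s|Hot\\s)Water");

    // Lookbehind that sees 1 to 10 whitespaces (bounded, so Java accepts it).
    static final Pattern UP_TO_TEN = Pattern.compile("(?<!(?:Cold|Hot)\\s{1,10})Water");

    // True if the pattern matches anywhere in the input.
    static boolean found(Pattern p, String input) {
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        String threeSpaces = "Cold   Water";
        // true: the single-space lookbehind misses "Cold" behind three spaces.
        System.out.println(found(SINGLE_SPACE, threeSpaces));
        // false: the bounded quantifier lets the lookbehind reach "Cold".
        System.out.println(found(UP_TO_TEN, threeSpaces));
    }
}
```

With more than 10 whitespaces the second pattern would match again, so pick an upper bound that fits your data.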
Another enhancement could be whole word matching: enclose the words with \b, the word boundary construct.
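Here is a small sketch of what the word boundaries change. The class name and the inputs "a big Waterfall" and "UltraCold Water" are invented for illustration. Without \b, Water is found inside Waterfall and "UltraCold" is treated like "Cold"; with \b, both are handled as distinct words.

```java
import java.util.regex.Pattern;

// Hypothetical demo class showing the effect of \b.
final class WholeWordDemo {

    // No word boundaries anywhere.
    static final Pattern LOOSE =
            Pattern.compile("(?<!(?:Cold|Hot)\\s{1,10})Water");

    // Whole words only: Water, Cold and Hot are all wrapped in \b.
    static final Pattern WHOLE_WORD =
            Pattern.compile("\\b(?<!(?:\\bCold\\b|\\bHot\\b)\\s{1,10})Water\\b");

    // True if the pattern matches anywhere in the input.
    static boolean found(Pattern p, String input) {
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        // LOOSE finds Water inside Waterfall; WHOLE_WORD does not.
        System.out.println(found(LOOSE, "a big Waterfall"));
        System.out.println(found(WHOLE_WORD, "a big Waterfall"));
        // LOOSE sees "Cold" at the end of UltraCold; WHOLE_WORD requires the whole word Cold.
        System.out.println(found(LOOSE, "UltraCold Water"));
        System.out.println(found(WHOLE_WORD, "UltraCold Water"));
    }
}
```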
Note that Matcher#matches() requires a full string match; you actually want to use Matcher#find().
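A quick way to convince yourself of the difference is to call both methods with the same pattern. The class and method names in this sketch are hypothetical, and it uses the basic lookbehind form for brevity.

```java
import java.util.regex.Pattern;

// Hypothetical demo class contrasting matches() and find().
final class MatchesVsFindDemo {

    static final Pattern WATER = Pattern.compile("(?<!Cold\\s|Hot\\s)Water");

    // matches(): the pattern must consume the entire input.
    static boolean wholeStringMatches(String input) {
        return WATER.matcher(input).matches();
    }

    // find(): the pattern may match any substring of the input.
    static boolean containsMatch(String input) {
        return WATER.matcher(input).find();
    }

    public static void main(String[] args) {
        String mydata = "This is Cold Water and this is Hot Water, have some Water";
        // false: the sentence as a whole is not just "Water".
        System.out.println(wholeStringMatches(mydata));
        // true: the last Water is found inside the sentence.
        System.out.println(containsMatch(mydata));
        // true: here the whole input is exactly "Water".
        System.out.println(wholeStringMatches("Water"));
    }
}
```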
Here is an example solution:
String mydata = "This is Cold Water and this is Hot Water, have some Water";
Pattern pattern = Pattern.compile("\\b(?<!(?:\\bCold\\b|\\bHot\\b)\\s{1,10})Water\\b");
Matcher matcher = pattern.matcher(mydata);
// find() looks for a match anywhere in the input; group(0) is the whole match
if (matcher.find()) {
    System.out.println(matcher.group(0));
}
See the Java online demo.
Pattern details
\\b - a word boundary
(?<! - start of the negative lookbehind that fails the match if, immediately to the left of the current location, there is:
    (?: - start of a non-capturing group matching either of the two alternatives:
        \\bCold\\b - a whole word Cold
        | - or
        \\bHot\\b - a whole word Hot
    ) - end of the non-capturing group
    \\s{1,10} - 1 to 10 whitespaces (you may use \s if you are sure there will only be 1 whitespace between the words)
) - end of the lookbehind
Water - the search word
\\b - a word boundary
Upvotes: 5