Reputation:
I need to take a string and extract every instance of a pattern and only the pattern.
String test = "This is a test string to experiment with regex by separating every instance of the word test and words that trail test";
So the pattern has to find the word test as well as any words before and after it that are not test. In other words, it should find 3 instances of this pattern.
The 3 results that I'm expecting are as follows:
This is a test string to experiment with regex by separating every instance of the word
test and words that trail
test
I've played around with positive lookahead and negative lookahead on gskinner, but no luck yet.
Upvotes: 2
Views: 157
Reputation: 74028
To follow up on my comment: I would split your test string on the pattern \btest\b and then join each pair of neighbouring parts (left and right of a match) with the word test in between.
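// Backslashes are doubled: in a Java string literal "\b" is a backspace, not a word boundary.
String[] parts = test.split("\\btest\\b", -1);
for (int i = 0; i < parts.length - 1; ++i) {
    System.out.println(parts[i] + "test" + parts[i + 1]);
}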
Upvotes: 0
Reputation: 92986
Try this
(\s*\b(?!test\b)[a-z]+\b\s*)*test(\s*\b(?!test\b)[a-z]+\b\s*?)*
See it here on Regexr.
In Java, I would replace [a-z] with \p{L}, but Regexr does not support Unicode properties. \p{L} matches any Unicode code point with the letter property, so it matches every letter in any language.
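To illustrate the difference between the two character classes, here is a minimal sketch. The class and method names are made up for this example and are not part of the answer above. Note that [a-z] without a case-insensitive flag also rejects the capitalised "This" at the start of the sample string.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class LetterClassDemo {
    // True if the whole word consists of ASCII lowercase letters.
    static boolean isAsciiLowerWord(String w) {
        return Pattern.matches("[a-z]+", w);
    }

    // True if the whole word consists of Unicode letters (any script, any case).
    static boolean isLetterWord(String w) {
        return Pattern.matches("\\p{L}+", w);
    }

    public static void main(String[] args) {
        // "t\u00ebst" is "test" with an e-diaeresis as its second letter.
        for (String w : new String[] {"test", "This", "t\u00ebst", "test1"}) {
            System.out.println(w + ": [a-z]+ = " + isAsciiLowerWord(w)
                    + ", \\p{L}+ = " + isLetterWord(w));
        }
    }
}
```

In a Java string literal the backslash has to be doubled, so \p{L} is written as "\\p{L}".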
Explanation:
(\s*\b(?!test\b)[a-z]+\b\s*)* matches a series of words that are not "test". This is ensured by the negative lookahead assertion (?!test\b).
test matches "test".
At the end, the same again: (\s*\b(?!test\b)[a-z]+\b\s*?)* matches a series of words that are not "test".
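For completeness, a sketch of how this could be wired up in Java with java.util.regex, using the \p{L} variant and the sample string from the question. The class name, constant name, and findAll helper are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class, for illustration only.
class TestContextDemo {
    // The regex from above with [a-z] swapped for \p{L};
    // every backslash is doubled for the Java string literal.
    static final Pattern CONTEXT = Pattern.compile(
            "(\\s*\\b(?!test\\b)\\p{L}+\\b\\s*)*test(\\s*\\b(?!test\\b)\\p{L}+\\b\\s*?)*");

    // Collects every non-overlapping match in order of appearance.
    static List<String> findAll(String input) {
        List<String> results = new ArrayList<>();
        Matcher m = CONTEXT.matcher(input);
        while (m.find()) {
            results.add(m.group());
        }
        return results;
    }

    public static void main(String[] args) {
        String test = "This is a test string to experiment with regex by separating "
                + "every instance of the word test and words that trail test";
        // Brackets make leading or trailing whitespace in a match visible.
        for (String match : findAll(test)) {
            System.out.println("[" + match + "]");
        }
    }
}
```

On the sample string this prints the three results from the question, each in brackets. The lazy \s*? at the end of the pattern is what keeps the trailing space out of each match, and because the first match stops right before the second "test", each word ends up in exactly one result.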
Upvotes: 3