Reputation: 1705
I have written some code, but it doesn't work correctly. Below are my regex, the input, and the output I expect. I am using a non-capturing group because I want to read the text until I reach the word "Bundle", but I don't want to include that word in the captured text. I can't see what I have done wrong.
Here is my code:
Pattern pattern = Pattern.compile(
        "((Bundle\\s+Components)|(Included\\s+Components))\\s+(.*?)(?:Bundle)", Pattern.DOTALL);
Matcher matcher = pattern.matcher(tableInformation);
while (matcher.find()) {
    String bundleComponents = matcher.group();
    System.out.println(bundleComponents);
}
Here are the examples.
Example 1:
Bundle Components bla blah\blabla?!()\\ANY CHARACTER IS POSSIBLE HERE, EVEN LINEBREAK,blah blah
Bundle Type
Example 2:
Included Components
blah blah, like above,
Bundle Type
output I expect for Ex. 1:
Bundle Components bla blah\blabla?!()\\ANY CHARACTER IS POSSIBLE HERE, EVEN LINEBREAK,blah blah
output I expect for Ex. 2:
Included Components
blah blah, like above,
What I get as the output for Ex. 1:
Bundle Components bla blah\blabla?!()\\ANY CHARACTER IS POSSIBLE HERE, EVEN LINEBREAK,blah blah
Bundle Type
What I get as the output for Ex. 2:
Included Components
blah blah, like above,
Bundle Type
Upvotes: 1
Views: 1811
Reputation: 3573
The full match contains everything the regex matched, including the text matched by non-capturing groups. A non-capturing group only avoids creating a numbered group; it does not remove its text from the match. To leave "Bundle" out, either read the appropriate capturing group instead of the full match, or replace the non-capturing group with a positive lookahead. See the regex below; I also removed some groups that are unnecessary (IMO).
(?:Bundle\s+Components|Included\s+Components)\s+.*?(?=Bundle)
It produces a single full match that stops just before "Bundle".
PS: The newline character just before "Bundle" is still part of the match with this solution.
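For illustration, here is a minimal Java sketch of how the lookahead regex above could be plugged into the code from the question. The class name LookaheadDemo and the method extract are made up for this example, and the call to trim() is my own addition to drop the trailing newline mentioned in the PS; Pattern.DOTALL is kept from the question so that '.' can cross line breaks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question.
class LookaheadDemo {
    // Header, whitespace, then lazily any text up to (but not including)
    // the next "Bundle". DOTALL lets '.' match line breaks as well.
    private static final Pattern PATTERN = Pattern.compile(
            "(?:Bundle\\s+Components|Included\\s+Components)\\s+.*?(?=Bundle)",
            Pattern.DOTALL);

    static List<String> extract(String tableInformation) {
        List<String> results = new ArrayList<>();
        Matcher matcher = PATTERN.matcher(tableInformation);
        while (matcher.find()) {
            // group() is the full match. The lookahead consumes nothing,
            // so "Bundle Type" is not part of it. trim() (an assumption
            // about the desired output) removes the trailing newline.
            results.add(matcher.group().trim());
        }
        return results;
    }

    public static void main(String[] args) {
        String input = "Included Components\nblah blah, like above,\nBundle Type";
        extract(input).forEach(System.out::println);
    }
}
```

If the trailing whitespace matters for your data, drop the trim() call and use matcher.group() as is.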
Upvotes: 1
Reputation: 6036
You can do this with a positive lookahead: the pattern inside the lookahead has to be present at that position, but it is not included in the match:
((?:Bundle\\s+Components)|(?:Included\\s+Components))\\s+(.*?)(?=Bundle)
(not tested)
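A quick way to check this pattern is a small sketch like the one below. The class name GroupDemo and the method firstMatch are invented for the example, and Pattern.DOTALL is assumed because the question uses it. It returns the full match together with the two capturing groups, so you can see that the header ends up in group 1 and the text between the header and "Bundle" in group 2.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class used only to try out the pattern.
class GroupDemo {
    private static final Pattern PATTERN = Pattern.compile(
            "((?:Bundle\\s+Components)|(?:Included\\s+Components))\\s+(.*?)(?=Bundle)",
            Pattern.DOTALL);

    // Returns {full match, header (group 1), body (group 2)},
    // or null when the pattern does not match at all.
    static String[] firstMatch(String text) {
        Matcher matcher = PATTERN.matcher(text);
        if (!matcher.find()) {
            return null;
        }
        return new String[] { matcher.group(), matcher.group(1), matcher.group(2) };
    }

    public static void main(String[] args) {
        String input = "Bundle Components bla blah\nblah blah\nBundle Type";
        String[] parts = firstMatch(input);
        if (parts != null) {
            System.out.println("full:   [" + parts[0] + "]");
            System.out.println("header: [" + parts[1] + "]");
            System.out.println("body:   [" + parts[2] + "]");
        }
    }
}
```

Note that both the full match and group 2 still end with the line break that precedes "Bundle Type"; call trim() on them if that is unwanted.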
Upvotes: 1