Reputation: 11093
In Java, I've been trying to parse a log file using regex. Below is one line of the log file.
I 20151007 090137 - com.example.Main - Main.doStuff (293): ##identifier (id:21): {};
I need the JSON string at the end of the line, and the id, which means I need two capturing groups. So I started coding.
Pattern p = Pattern.compile(
"^I [0-9]{8} [0-9]{6} - com\\.example\\.Main - Main\\.doStuff \\(\\d+\\): ##identifier \\(id:(\\d+)\\): (.*?);$"
);
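The (.*?) at the end of the pattern should capture the rest of the line but leave out the ; at the very end of the input line. (The ? makes the quantifier reluctant rather than greedy; because of the trailing ;$ it still matches everything up to the final ;.)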
Matcher m = p.matcher(readAboveLogfileLineToString());
System.err.println(m.matches() + ", " + m.groupCount());
for (int i = 0; i < m.groupCount(); i++) {
System.out.println(m.group(i));
}
However, the above code outputs:
true, 2
I 20151007 090137 - com.example.Main - Main.doStuff (293): ##identifier (id:21): {};
21
But where's my "rest" group? And why is the entire line a group? I've checked multiple online regex test sites, and it should work: http://www.regexplanet.com/advanced/java/index.html, for example, sees 3 capturing groups. Maybe it has to do with the fact that I'm currently using JDK 1.6?
Upvotes: 1
Views: 44
Reputation: 48404
The problem is that the groupCount iteration is one of the few cases in Java where you actually need to reach the count value to get all groups.
In this case, you need to iterate to group 2, since group 0 actually represents the whole match.
Just increment your counter as such (notice the <= instead of just <):
for (int i = 0; i <= m.groupCount(); i++) {
The last text printed should be: {}
You can also skip group 0 and start your count at 1 directly, of course.
To summarize, the explicit groups marked in the Pattern with parentheses start from index 1.
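As a minimal sketch of the corrected iteration, the snippet below reuses the pattern and the sample log line from the question and collects only the explicit groups, starting at index 1 and running up to and including groupCount(). The class name GroupDemo and the helper captureGroups are hypothetical names chosen for illustration; they are not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // Same pattern as in the question: group 1 is the id, group 2 is the JSON payload.
    private static final Pattern LOG_LINE = Pattern.compile(
        "^I [0-9]{8} [0-9]{6} - com\\.example\\.Main - Main\\.doStuff \\(\\d+\\): ##identifier \\(id:(\\d+)\\): (.*?);$");

    // Hypothetical helper: returns the explicit capturing groups (1..groupCount),
    // or an empty list if the line does not match the pattern.
    static List<String> captureGroups(String line) {
        List<String> groups = new ArrayList<String>();
        Matcher m = LOG_LINE.matcher(line);
        if (m.matches()) {
            // Group 0 is the whole match, so start at 1 and include groupCount() itself.
            for (int i = 1; i <= m.groupCount(); i++) {
                groups.add(m.group(i));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        String line = "I 20151007 090137 - com.example.Main - Main.doStuff (293): ##identifier (id:21): {};";
        System.out.println(captureGroups(line)); // prints [21, {}]
    }
}
```

Starting the loop at 0 instead would additionally put the entire matched line at the front of the list.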
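See the documentation for Matcher.groupCount(): group zero denotes the entire pattern by convention and is not included in the count.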
Upvotes: 3