Reputation: 1678
I want to read the comments from a .sql file and extract the values:
<!--
@fake: some
@author: some
@ticket: ti-1232323
@fix: some fix
@release: master
@description: This is test example
-->
Code:
String text = String.join("", Files.readAllLines(file.toPath()));
Pattern pattern = Pattern.compile("^\\s*@(?<key>(fake|author|description|fix|ticket|release)): (?<value>.*?)$", Pattern.MULTILINE);
Matcher matcher = pattern.matcher(text);
while (matcher.find())
{
    if (matcher.group("key").equals("author")) {
        author = matcher.group("value");
    }
    if (matcher.group("key").equals("description")) {
        description = matcher.group("value");
    }
}
The first key, in this case fake, is always empty. If I put author as the first key, it is again empty. Do you know how I can fix the regex pattern?
Upvotes: 1
Views: 71
Reputation: 163287
If the <!-- and --> parts should be there, you could make use of the \G anchor to get consecutive matches and keep the groups.
Note that the alternatives are already in a named capturing group (?<key> so you don't have to wrap them in another group. The part in group value does not need to be non-greedy, as you are matching to the end of the line anyway.
As @Wiktor Stribiżew mentioned, you are joining the lines back without a newline, so the separate parts cannot be matched using, for example, the anchor $, which asserts the end of a line in multiline mode.
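To make the effect of the separator concrete, here is a minimal sketch that runs the question's pattern (with the redundant inner group dropped) over the sample lines, joined both ways. The class name JoinDemo and the parse helper are made up for illustration, and the lines are hard-coded instead of being read from a file.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JoinDemo {
    // The question's pattern; the inner group around the alternatives is not needed.
    static final Pattern TAG = Pattern.compile(
        "^\\s*@(?<key>fake|author|description|fix|ticket|release): (?<value>.*?)$",
        Pattern.MULTILINE);

    // Hypothetical helper: collects every key/value pair the pattern finds.
    static Map<String, String> parse(String text) {
        Map<String, String> tags = new LinkedHashMap<>();
        Matcher m = TAG.matcher(text);
        while (m.find()) {
            tags.put(m.group("key"), m.group("value"));
        }
        return tags;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "<!--",
            "@fake: some",
            "@author: some",
            "@ticket: ti-1232323",
            "@fix: some fix",
            "@release: master",
            "@description: This is test example",
            "-->");

        // Joined without a separator: the whole file becomes one line.
        System.out.println(parse(String.join("", lines)));
        // Joined with "\n": every tag sits at the start of its own line.
        System.out.println(parse(String.join("\n", lines)));
    }
}
```

With an empty separator the text has no line breaks, so ^ can only match at the very start, where <!-- sits, and the map comes back empty. With "\n" as the separator all six keys are found.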
Pattern
(?:^<!--(?=.*(?:\R(?!-->).*)*\R-->)|\G(?!^))\R@(?<key>fake|author|description|fix|ticket|release): (?<value>.*)$
Explanation
(?:
Non-capturing group
^
Start of line
<!--
Match literally
(?=.*(?:\R(?!-->).*)*\R-->)
Assert an ending -->
|
Or
\G(?!^)
Assert the end of the previous match, not at the start
)
Close group
\R@
Match a Unicode newline sequence and @
(?<key>
Named group key, match any of the alternatives
fake|author|description|fix|ticket|release
):
Close group key, then match the colon and space literally
(?<value>.*)$
Named group value, match any char except a newline until the end of the line
Example code
String text = String.join("\n", Files.readAllLines(file.toPath()));
String regex = "(?:^<!--(?=.*(?:\\R(?!-->).*)*\\R-->)|\\G(?!^))\\R@(?<key>fake|author|description|fix|ticket|release): (?<value>.*)$";
Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
    if (matcher.group("key").equals("author")) {
        System.out.println(matcher.group("value"));
    }
    if (matcher.group("key").equals("description")) {
        System.out.println(matcher.group("value"));
    }
}
Output
some
This is test example
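As a rough illustration of what the \G chaining buys you, the sketch below runs the same pattern over text that also has tags outside the comment block. The class name CommentTags, the parse helper, and the sample text are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommentTags {
    // Same pattern as above, split across two string literals for readability.
    static final Pattern TAG = Pattern.compile(
        "(?:^<!--(?=.*(?:\\R(?!-->).*)*\\R-->)|\\G(?!^))"
            + "\\R@(?<key>fake|author|description|fix|ticket|release): (?<value>.*)$",
        Pattern.MULTILINE);

    // Hypothetical helper: collects the tags found inside <!-- ... --> blocks.
    static Map<String, String> parse(String text) {
        Map<String, String> tags = new LinkedHashMap<>();
        Matcher m = TAG.matcher(text);
        while (m.find()) {
            tags.put(m.group("key"), m.group("value"));
        }
        return tags;
    }

    public static void main(String[] args) {
        String text = String.join("\n",
            "@author: outside",   // before the block: no <!-- and no previous match
            "<!--",
            "@author: inside",
            "@release: master",
            "-->",
            "@fix: stray");       // after the block: the chain was broken by -->
        System.out.println(parse(text));
    }
}
```

Only lines that directly continue a chain started by <!-- are picked up, so this should print {author=inside, release=master}; the tags before and after the block are skipped, and a block without a closing --> yields no matches at all.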
Upvotes: 0
Reputation: 521194
Use the following regex pattern:
(?<!\S)@(?<key>(?:fake|author|description|fix|ticket|release)): (?<value>.*?(?![^@]))
The negative lookbehind (?<!\S) used above asserts that the @ is preceded by either whitespace or the start of the string, covering the initial edge case. The negative lookahead (?![^@]) at the end of the pattern stops the match before the next @ term begins, or upon hitting the end of the input.
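// Join with "\n": with an empty separator, no @ in the sample above is preceded
// by whitespace, so the lookbehind (?<!\S) would never succeed.
String text = String.join("\n", Files.readAllLines(file.toPath()));
Pattern pattern = Pattern.compile("(?<!\\S)@(?<key>(?:fake|author|description|fix|ticket|release)): (?<value>.*?(?![^@]))", Pattern.DOTALL);
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
    // value runs up to the next @ or the end of the input, so it can carry a
    // trailing newline (and, for the last key, the closing -->); trim as needed.
    if ("author".equals(matcher.group("key"))) {
        author = matcher.group("value").trim();
    }
    if ("description".equals(matcher.group("key"))) {
        description = matcher.group("value").trim();
    }
}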
Upvotes: 1