Reputation: 5818
I have the following string:
@name Home @options {} @include h1,h2,h3 @exclude p,div,em
I want to split it with a regex and store the result in a HashMap, like this:
@name->Home
@options->{}
@include->h1,h2,h3
@exclude->p,div,em
I used the regex below, but it matches the entire string after @name:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NewClass {
    public static void main(String[] args) {
        String regex = "((?<var>@(\\S)+) (?<val>.+) *)+";
        String val = "@name Home @options {} @include h1,h2,h3 @exclude p,div,em";
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(val);
        if (matcher.matches()) {
            System.out.println(matcher.group("var"));
            System.out.println(matcher.group("val"));
        }
    }
}
It outputs:
@name
Home @options {} @include h1,h2,h3 @exclude p,div,em
Upvotes: 3
Views: 122
Reputation: 88707
The problem with your regex is that you don't know the number of groups in your input, i.e. how many @xxx groups there are. Thus you'll need to apply the regex multiple times, i.e. using a while-loop and matcher.find():
while (matcher.find()) {
    System.out.println(matcher.group("var"));
    System.out.println(matcher.group("val"));
}
That said, your regex needs to match a single group only. Assuming there's nothing else in between, you basically match from one @ to the next @ or to the end of the input. Hence your expression could become (?<var>@(\S)+) (?<val>[^@]+).
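Putting the loop and the single-pair expression together, a minimal sketch could look like the following. The class name DirectiveParser and the parse method are made up for illustration. The sketch uses \S+ and \s+ rather than (\S)+ and a single space, and it trims each value because [^@]+ also picks up the whitespace in front of the next @. It assumes values never contain a @ themselves.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question.
class DirectiveParser {
    // One "@key value" pair: the key is '@' plus non-whitespace characters,
    // the value is everything up to the next '@' or the end of the input.
    private static final Pattern PAIR =
            Pattern.compile("(?<var>@\\S+)\\s+(?<val>[^@]+)");

    static Map<String, String> parse(String input) {
        Map<String, String> result = new HashMap<>();
        Matcher matcher = PAIR.matcher(input);
        // find() moves from one pair to the next; matches() would require
        // the whole input to be a single pair.
        while (matcher.find()) {
            // Trim the whitespace that [^@]+ captured before the next '@'.
            result.put(matcher.group("var"), matcher.group("val").trim());
        }
        return result;
    }

    public static void main(String[] args) {
        String val = "@name Home @options {} @include h1,h2,h3 @exclude p,div,em";
        // Print each entry; HashMap does not guarantee any particular order.
        parse(val).forEach((k, v) -> System.out.println(k + "->" + v));
    }
}
```

If the order of the pairs matters, a LinkedHashMap could be swapped in for the HashMap.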
That expression basically consists of 2 parts with a single space in between (you might want to change that to \s+ instead):
(?<var>@(\S)+) matches the group name, starting with @ and continuing with anything that is not whitespace. Note that the inner group is not needed here, so just use \S+, unless you want to extract the name without the @. In that case the group has to wrap the whole run, i.e. @(\S+), because a repeated group like (\S)+ only keeps the last character it matched.
(?<val>[^@]+) matches any sequence of at least one character that's not a @, i.e. anything up to the next @ or the end of the input. Note that you'd not match empty groups that way, so if you want to match those as well you might want to change the quantifier to * instead.
Upvotes: 2
Reputation: 626802
Use the regex (?<var>@\S+)\s+(?<val>\S+) and, instead of .matches(), which requires a full-string match, use while (matcher.find()):
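import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String regex = "(?<var>@\\S+)\\s+(?<val>\\S+)";
String val = "@name Home @options {} @include h1,h2,h3 @exclude p,div,em";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(val);
Map<String, String> m = new HashMap<String, String>();
// find() advances to the next "@key value" pair on each iteration
while (matcher.find()) {
    m.put(matcher.group("var"), matcher.group("val"));
}
// HashMap iteration order is not guaranteed
System.out.println(m); // => {@name=Home, @exclude=p,div,em, @include=h1,h2,h3, @options={}}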
See the Java demo
Upvotes: 1
Reputation: 140427
Why use regexes for everything?
Just saying: a simple parser that splits on "@" might lead to code that is easier to understand.
That gives you an array of "var value" strings; in each one, the part before the first space is the name and the substring after it is the value.
You see - you need other people to come up with a "correct" regex. That probably means that you have to turn to other people every time you want to enhance/rework/update that regex.
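A rough sketch of that idea, with the made-up names SplitParser and parse, might look like this. String.split does take a regex argument, but "@" has no special meaning in a regex, so it acts as a plain delimiter here. The sketch assumes values never contain a @, and it puts the @ back on the key so the map matches the format asked for in the question.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class illustrating the split-based approach.
class SplitParser {
    static Map<String, String> parse(String input) {
        Map<String, String> result = new HashMap<>();
        // Each non-empty piece looks like "name value", possibly with a trailing space.
        for (String piece : input.split("@")) {
            String entry = piece.trim();
            if (entry.isEmpty()) {
                continue; // skip the empty piece in front of the first '@'
            }
            int space = entry.indexOf(' ');
            // Before the first space: the name. After it: the value.
            String name = space < 0 ? entry : entry.substring(0, space);
            String value = space < 0 ? "" : entry.substring(space + 1).trim();
            result.put("@" + name, value);
        }
        return result;
    }

    public static void main(String[] args) {
        String val = "@name Home @options {} @include h1,h2,h3 @exclude p,div,em";
        // Print each entry; HashMap does not guarantee any particular order.
        parse(val).forEach((k, v) -> System.out.println(k + "->" + v));
    }
}
```

A key with no value (for example a trailing "@flag") ends up mapped to an empty string in this sketch.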
Upvotes: 0