Reputation: 121
I am building a JSP but I am new to regex and am having some trouble. I have a very long string with a pattern that looks like this:
==SOME_ID== - item 1 - item 2 - item 3 .. item 100 == SOME_ID_2 == - item 1 - item 2 - item 3 ... item 100 == SOME_ID_3 == ...
so it has the "identifier", which is enclosed in '==' characters, followed by a dash ("-") separated list. I am trying to extract the identifiers and their item elements. Once I have the information extracted from the string, I plan on constructing an XML document with it.
One more note, an "item" can be more than one word.
EDIT: here is my code so far
<%
String testStr = (String)pageContext.getAttribute("longStr");
String[] ids = null;
String delimeterRegex = "(?i),==*==";
ids = testStr.split(delimeterRegex);
pageContext.setAttribute("ids", ids);
%>
<c:forEach items="${ids}" var="id">
${id}
</c:forEach>
Any help would be greatly appreciated. Thank you
Upvotes: 0
Views: 377
Reputation: 425003
Here's some code that will create a map of the name to the array of its values:
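// Requires java.util.HashMap and java.util.Map.
Map<String, String[]> map = new HashMap<String, String[]>();
// Split just before each "==name==" header, except at the very start of the input.
for (String mapping : input.split("(?<!^)(?===\\s*\\w+\\s*==)")) {
    mapping = mapping.trim(); // drop the whitespace left in front of the next header
    String name = mapping.replaceAll("^==\\s*(\\w+).*", "$1");
    String[] values = mapping.replaceAll("^==\\s*\\w+\\s*==\\s*-*\\s*", "").split("\\s*-\\s*");
    map.put(name, values);
}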
This first splits using a look-ahead that matches on a "name". Look-aheads are zero-width: they match a position without consuming any characters, so the name is preserved for the next step.
The name-and-values String then has the name part extracted, and the values part is split on dashes. The regexes are written so that whitespace around the name and around each dash is discarded.
I've tested it and it works well - stripping off any optional whitespace around name and values.
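For anyone who wants to try the look-ahead split in isolation, here is a minimal self-contained sketch. The class name `LookaheadSplitDemo`, the helper `parse`, and the shortened sample input are made up for illustration. It uses a `LinkedHashMap` only so the identifiers print in the order they appear, and it assumes items never contain a dash themselves.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class LookaheadSplitDemo {

    // Hypothetical helper: maps each identifier to its array of items.
    static Map<String, String[]> parse(String input) {
        Map<String, String[]> map = new LinkedHashMap<>(); // keeps insertion order
        // Zero-width split: cut before each "== name ==" header, but not at index 0.
        for (String mapping : input.split("(?<!^)(?===\\s*\\w+\\s*==)")) {
            String section = mapping.trim(); // remove whitespace left before the next header
            String name = section.replaceAll("^==\\s*(\\w+).*", "$1");
            String[] values = section
                    .replaceAll("^==\\s*\\w+\\s*==\\s*-*\\s*", "") // strip the header and first dash
                    .split("\\s*-\\s*");                           // split on dashes, eating spaces
            map.put(name, values);
        }
        return map;
    }

    public static void main(String[] args) {
        // Shortened sample input (assumption: same shape as the string in the question).
        String input = "==SOME_ID== - item 1 - item 2 == SOME_ID_2 == - first thing - second thing";
        for (Map.Entry<String, String[]> e : parse(input).entrySet()) {
            System.out.println(e.getKey() + " -> " + Arrays.toString(e.getValue()));
        }
    }
}
```

Because the split pattern consumes nothing, each piece still starts with its own `==name==` header, which is what lets the two `replaceAll` calls pull the name and the values out of the same string.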
Upvotes: 1
Reputation: 726559
You can use this regular expression:
==([^=]+)==([^=]+)(?=(?:=|$))
This expression captures a string between two pairs of equal signs, and then takes everything until the next = or the end of string. The ID becomes the first capturing group; the data becomes the second one. Groups are numbered from one, not from zero (group zero is special - it represents the entire match).
Here is a complete example:
String data = "==SOME_ID== - item 1 - item 2 - item 3 .. item 100 == SOME_ID_2 == - item 1 - item 2 - item 3 ... item 100 == SOME_ID_3 == ...";
Pattern p = Pattern.compile("==([^=]+)==([^=]+)(?=(?:=|$))");
Matcher m = p.matcher(data);
while (m.find()) {
    System.out.println("ID="+m.group(1));
    System.out.println("Data="+m.group(2));
}
This prints:
ID=SOME_ID
Data= - item 1 - item 2 - item 3 .. item 100
ID= SOME_ID_2
Data= - item 1 - item 2 - item 3 ... item 100
ID= SOME_ID_3
Data= ...
Once you get your data (i.e. group(2)) you could run a String.split on the dash to separate out the individual data elements.
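A rough sketch of that last step, combining the matcher loop with the split: the class name `GroupSplitDemo`, the helper `parse`, and the shortened sample string are invented for this example. It trims each piece and skips empty ones, since group 1 and group 2 keep their surrounding whitespace and group 2 starts with a dash. It also assumes items never contain a dash.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupSplitDemo {

    // Same expression as above: group 1 is the ID, group 2 is the raw item list.
    private static final Pattern SECTION =
            Pattern.compile("==([^=]+)==([^=]+)(?=(?:=|$))");

    // Hypothetical helper: returns identifier -> items, in order of appearance.
    static Map<String, List<String>> parse(String data) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        Matcher m = SECTION.matcher(data);
        while (m.find()) {
            String id = m.group(1).trim(); // group 1 may carry spaces, e.g. " SOME_ID_2 "
            List<String> items = new ArrayList<>();
            for (String piece : m.group(2).split("-")) {
                String item = piece.trim();
                if (!item.isEmpty()) { // the text before the first dash is blank
                    items.add(item);
                }
            }
            result.put(id, items);
        }
        return result;
    }

    public static void main(String[] args) {
        // Shortened sample input (assumption: same shape as the string in the question).
        String data = "==SOME_ID== - item 1 - item 2 == SOME_ID_2 == - a b - c";
        for (Map.Entry<String, List<String>> e : parse(data).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

From the resulting map it should be straightforward to emit one XML element per identifier with a child element per item.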
Upvotes: 2