Reputation: 86915
<b>Topic1</b><ul>asdasd</ul><br/><b>Topic2</b><ul>....
I want to extract everything that comes between <b>Topic1</b> and the next <b> starting tag, which in this case would be: <ul>asdasd</ul><br/>.
Problem: it need not necessarily be the <b> tag; it could be any other repeating tag.
So my question is: how can I dynamically extract that text? The only static things are the keyword (Topic1) and the fact that the tag wrapping it repeats; that tag need not be <b>, it might as well be <i>, <strong>, <h1>, etc.
I know how to write the Java code, but what would the regex look like?
String regex = ">Topic1<";
Matcher m = Pattern.compile(regex).matcher(text);
while (m.find()) {
    for (int i = 1; i <= m.groupCount(); i++) {
        System.out.println(m.group(i));
    }
}
Upvotes: 1
Views: 201
Reputation: 371
Try this:
String pattern = "<.*?>Topic1<.*?>"; // matches the tags wrapped around Topic1, whatever tag it is
String text = "<b>Topic1</b><ul>asdasd</ul><br/><b>Topic2</b>"; // your string to be split
String[] attributes = text.split(pattern);
for (String atr : attributes) {
    System.out.println(atr);
}
Will print out (preceded by an empty line, because the match sits at the very start of the string):
<ul>asdasd</ul><br/><b>Topic2</b>
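To make it easier to see what split actually hands back here, the array can be printed in one go. This is only a small sketch; the class name SplitDemo and the helper splitOnTopic are made up for illustration. Note that the remainder still contains <b>Topic2</b>, so it would need a further cut to stop at the next heading.

```java
import java.util.Arrays;

class SplitDemo {
    // Hypothetical helper: splits on the tag pair wrapped around "Topic1".
    static String[] splitOnTopic(String text) {
        return text.split("<.*?>Topic1<.*?>");
    }

    public static void main(String[] args) {
        String text = "<b>Topic1</b><ul>asdasd</ul><br/><b>Topic2</b>";
        // The first element is empty because the delimiter matches at index 0.
        System.out.println(Arrays.toString(splitOnTopic(text)));
        // prints: [, <ul>asdasd</ul><br/><b>Topic2</b>]
    }
}
```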
Upvotes: 0
Reputation: 59681
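The following should work (shown as a plain regex; in a Java string literal the back-reference is written \\1):
Topic1</(.+?)>(.*?)<\1>
Group 1 captures the tag name from the closing tag right after Topic1, and the back-reference \1 stops the lazy group 2 at the next opening tag with that same name.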
Input: <b>Topic1</b><ul>asdasd</ul><br/><b>Topic2</b><ul>
Output: <ul>asdasd</ul><br/>
Code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern p = Pattern.compile("Topic1</(.+?)>(.*?)<\\1>");
// get a matcher object
Matcher m = p.matcher("<b>Topic1</b><ul>asdasd</ul><br/><b>Topic2</b><ul>");
while (m.find()) {
    System.out.println(m.group(2)); // <ul>asdasd</ul><br/>
}
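If the keyword varies, or the tags may carry attributes or span several lines, the same back-reference idea can be wrapped in a small helper. The following is a sketch under assumptions, not a general HTML parser: the class TopicExtractor and the method extractAfter are hypothetical names, the tag name is captured from the opening tag instead of the closing one, the keyword is escaped with Pattern.quote, and DOTALL lets the content cross line breaks.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TopicExtractor {
    // Hypothetical helper: returns the markup between <tag>keyword</tag>
    // and the next opening <tag ...> with the same tag name, if any.
    static Optional<String> extractAfter(String html, String keyword) {
        Pattern p = Pattern.compile(
                "<([A-Za-z][A-Za-z0-9]*)\\b[^>]*>"   // group 1: tag name (attributes allowed)
                + Pattern.quote(keyword)             // keyword taken literally
                + "</\\1>"                           // matching closing tag
                + "(.*?)"                            // group 2: the content we want (lazy)
                + "<\\1\\b",                         // next opening tag with the same name
                Pattern.DOTALL);
        Matcher m = p.matcher(html);
        return m.find() ? Optional.of(m.group(2)) : Optional.empty();
    }

    public static void main(String[] args) {
        String html = "<b>Topic1</b><ul>asdasd</ul><br/><b>Topic2</b><ul>....";
        System.out.println(extractAfter(html, "Topic1").orElse("(no match)"));
        // prints: <ul>asdasd</ul><br/>
    }
}
```

The word boundary after the back-reference keeps <br/> from being mistaken for the next <b> tag. If the keyword is the last topic in the input, nothing follows with the same tag, so the helper returns an empty Optional.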
Upvotes: 2