Java Regex to get the text from HTML anchor (<a>...</a>) tags

Question

I'm trying to get the text inside a certain tag. So if I have:

<a ...>Found</a>

I want to be able to retrieve the Found text.

I'm trying to do it using regex. I am able to do it if the .*" );

I think the last two parts - the ([a-zA-Z0-9 ]*).* - are ok but I don't know what to do for the first part.

Tim Pietzcker · Accepted Answer

As they said, don't use regex to parse HTML. If you are aware of the shortcomings, you might get away with it, though. Try

// requires java.util.regex.Pattern and java.util.regex.Matcher
Pattern titleFinder = Pattern.compile("<a[^>]*>(.*?)</a>", Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
Matcher regexMatcher = titleFinder.matcher(subjectString);
while (regexMatcher.find()) {
    // matched text: regexMatcher.group(1)
}

This will iterate over all matches in a string.

It won't handle nested tags and ignores all the attributes inside the tag.
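
For anyone who wants to try this out, here is a minimal, self-contained sketch of the same idea. The class name AnchorTextExtractor, the helper extractAnchorTexts, and the sample HTML strings are made up for illustration; only the pattern and the flags come from the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnchorTextExtractor {
    // Same pattern as in the answer: an opening <a ...> tag with any attributes,
    // then a lazy capture that stops at the first closing </a>.
    private static final Pattern ANCHOR = Pattern.compile(
            "<a[^>]*>(.*?)</a>", Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: collects group(1) of every match, in document order.
    public static List<String> extractAnchorTexts(String html) {
        List<String> texts = new ArrayList<>();
        Matcher matcher = ANCHOR.matcher(html);
        while (matcher.find()) {
            texts.add(matcher.group(1));
        }
        return texts;
    }

    public static void main(String[] args) {
        // Attributes are skipped and tag case does not matter.
        String html = "<p><a href=\"http://example.com\">Found</a> and "
                + "<A class=\"x\">Also found</A></p>";
        System.out.println(extractAnchorTexts(html)); // prints [Found, Also found]

        // Limitation: markup nested inside the anchor is returned verbatim.
        System.out.println(extractAnchorTexts("<a href=\"#\"><b>Bold</b></a>")); // prints [<b>Bold</b>]
    }
}
```

The second call shows the nested-tag shortcoming in practice: group(1) is whatever sits between the tags, markup included. Also note that <a[^>]* will match other tags whose names start with "a" (for example <abbr>), so something like "<a\\b[^>]*>" may be a safer opening if that matters for your input.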
