Reputation: 157
I need to know if a section of a string contains a specific word.
Example: search for color=" in the section from <font to >.
Input = expected result:
<font color="black"> = <font color="black">
BlaBla <font color="red"> = <font color="red">
<font size="2" color="white"> = <font size="2" color="white">
<font size="2"> = false
<font size="10"><font color="black"><font size="10"> = <font color="black">
I am using Java with String.matches().
Upvotes: 1
Views: 115
Reputation: 31245
You can handle this with a regex, but that is fragile when applied to HTML.
jsoup, on the other hand, is built for this use case and is very easy to use.
Example:
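import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
public class Main {
    public static void main(String[] argv) {
        Document document = Jsoup.parse("<font id=\"myFont\" color=\"black\">");
        // Select every <font> element and print its color attribute
        Elements fonts = document.select("font");
        for (Element element : fonts) {
            System.out.println(element.attr("color"));
        }
    }
}
Output: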
black
Upvotes: 2
Reputation: 8988
Try the following regex:
(?<=\<)(\w+)[^<]*color.*?\>
Demo:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Demo {
    public static void main(String[] args) {
        String data = "<font color=\"black\">";
        String strFind = "color";
        // Group 1 captures the name of the tag that contains strFind
        Pattern regex = Pattern.compile("(?<=<)(\\w+)[^<]*" + Pattern.quote(strFind) + ".*?>");
        Matcher matcher = regex.matcher(data);
        while (matcher.find()) {
            System.out.println(matcher.group(1));
        }
    }
}
Given the sample text, it prints the name of the tag that contains the search string. In this case, that is font.
Upvotes: 1
Reputation: 2939
For parsing HTML, it is better to use jsoup. For a quick introduction, start with its cookbook.
Upvotes: 3
Reputation: 56779
Based only on the example test cases you provided, you might be able to get away with a simple regular expression like this:
<font[^>]*color="[^"]+"[^>]*>
Demo: http://jpad.io/example/1u/36573959-example
However, as pointed out in the comments, regular expressions are generally not well-suited for processing HTML.
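Since the question mentions String.matches(), here is a minimal sketch of how the expression above could be applied in Java. String.matches() only succeeds when the entire string matches the pattern, so the sketch uses Matcher.find() to search within the input instead. The class name FontColorFinder and the method findColoredFontTag are hypothetical names chosen for illustration, and the inputs are the five test cases from the question.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named for illustration only.
final class FontColorFinder {
    // The regex from this answer: a <font ...> tag with a non-empty color="..." attribute.
    private static final Pattern FONT_WITH_COLOR =
            Pattern.compile("<font[^>]*color=\"[^\"]+\"[^>]*>");

    // Returns the first matching tag, or empty if no <font> tag carries a color attribute.
    static Optional<String> findColoredFontTag(String input) {
        Matcher matcher = FONT_WITH_COLOR.matcher(input);
        // find() scans for a match anywhere; matches() would require the whole input to match.
        return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "<font color=\"black\">",
            "BlaBla <font color=\"red\">",
            "<font size=\"2\" color=\"white\">",
            "<font size=\"2\">",
            "<font size=\"10\"><font color=\"black\"><font size=\"10\">"
        };
        for (String input : inputs) {
            // Print the matched tag, or "false" as in the question's expected results.
            System.out.println(findColoredFontTag(input).orElse("false"));
        }
    }
}
```

For the five inputs this prints the matched tag in each case except the fourth, which prints false. Note that the pattern would also match an attribute such as bgcolor="...", because it does not require whitespace before color.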
Upvotes: 2