Reputation: 347

HTML tag finder

I am trying to write a method that finds and returns the first tag in a given HTML string, or returns null if no tag is found. (A tag would be something like <b>.)

I looked through the String class methods but couldn't find one that suits this purpose. My plan is to scan the string for a "<" and, once it is found, scan for the next ">", but I am unsure how to do so. I am also wondering whether I need a while/for loop. Help is appreciated, thank you.

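public class HTMLProcessor {

    public static void main(String[] args) {
        System.out.println(findFirstTag("<b>The man jumped.</b>"));
    }

    // Returns the first "<...>" span in text, or null if there is none.
    public static String findFirstTag(String text) {
        int firstIndex = text.indexOf("<");
        if (firstIndex < 0) {
            return null;
        }
        // Search from firstIndex so secondIndex is an index into text itself,
        // not into a shortened substring.
        int secondIndex = text.indexOf(">", firstIndex);
        if (secondIndex < 0) {
            return null; // a "<" with no closing ">" is not a tag
        }
        return text.substring(firstIndex, secondIndex + 1);
    }
}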

Upvotes: 3

Answers (3)

Suresh Atta

Reputation: 121998

You can try the indexOf() and lastIndexOf() methods of the String class.

If you are doing this many times and in many places, though, you definitely need an HTML parser. Just pick one; Jsoup is one of the best HTML parsers.

And avoid relying on regex when dealing with HTML strings.
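
As a minimal sketch of the indexOf() approach (not a substitute for a real parser): the two-argument String.indexOf(int, int) starts searching at a given position, so the lookup for ">" can begin at the "<" and still return an index into the original string. A loop is only needed if you want every tag rather than just the first. The names TagScanner and findAllTags are hypothetical, chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class TagScanner {

    // Hypothetical helper: collects every "<...>" span in text, in order.
    public static List<String> findAllTags(String text) {
        List<String> tags = new ArrayList<>();
        int open = text.indexOf('<');
        while (open >= 0) {
            // Search for '>' starting at the '<' so the index refers to text itself.
            int close = text.indexOf('>', open);
            if (close < 0) {
                break; // unterminated "<", stop scanning
            }
            tags.add(text.substring(open, close + 1));
            open = text.indexOf('<', close + 1);
        }
        return tags;
    }

    public static void main(String[] args) {
        System.out.println(findAllTags("<b>The man jumped.</b>"));
    }
}
```

The first element of the returned list, if any, is the first tag. Note that this treats any "<...>" span as a tag, including comments, which is one reason a parser is the safer choice for real HTML.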

Upvotes: 2

zero_dev

Reputation: 653

Use regular expressions.

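import java.util.regex.Matcher;
import java.util.regex.Pattern;
Pattern p = Pattern.compile("<([A-Z][A-Z0-9]*)\\b[^>]*>(.*?)</\\1>", Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(yourText);

This will match whole elements like <b>this is bold</b>. The CASE_INSENSITIVE flag is needed because [A-Z] alone would not match lowercase tag names such as b. After m.find() succeeds, m.group(1) holds the tag name and m.group(2) the text between the tags.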
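
If only the first tag itself is wanted, rather than a whole element, a simpler pattern may be enough. The following is a sketch under the assumption that a tag is "<", then one or more characters other than ">", then ">"; it can be fooled by comments, scripts, or a ">" inside an attribute value. The class name FirstTagRegex is a placeholder.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstTagRegex {

    // Assumption: a "tag" is "<", one or more characters other than ">", then ">".
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    // Returns the first match of TAG in text, or null if there is none.
    public static String findFirstTag(String text) {
        Matcher m = TAG.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(findFirstTag("<b>The man jumped.</b>"));
        System.out.println(findFirstTag("No tags here."));
    }
}
```

Compiling the pattern once into a static field avoids recompiling it on every call, which matters if the method is used repeatedly.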

Upvotes: 2

kddeisz

Reputation: 5192

Take a look at Java regular expressions here. If you need an introduction to regex, look here. This is probably the quickest way to accomplish what you're looking for.

Upvotes: 1
