Extract text outside of an HTML tag

Question

I have the following HTML code:

<div class="example">Text #1</div> "Another Text 1"
<div class="example">Text #2</div> "Another Text 2"

I want to extract the text outside the tags: "Another Text 1" and "Another Text 2".

I'm using JSoup to achieve this.

Any ideas?

Thanks!

ollo · Accepted Answer

You can select the next sibling Node (not Element!) of each div tag. In your example these are all TextNodes.

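import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.TextNode;

// Sample input; inner quotes must be escaped inside a Java string literal
final String html = "<div class=\"example\">Text #1</div> \"Another Text 1\"\n"
                  + "<div class=\"example\">Text #2</div> \"Another Text 2\" ";

Document doc = Jsoup.parse(html);

for( Element element : doc.select("div.example") ) // Select all the div tags
{
    // Get the node right after each div; the cast assumes it is a TextNode
    TextNode next = (TextNode) element.nextSibling();

    System.out.println(next.text()); // Print the text of the TextNode
}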

Output:

 "Another Text 1" 
 "Another Text 2"
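
The cast above only works while every selected div is directly followed by text. If a div might be the last node in its parent, or be followed by another element, a slightly more defensive variant could look like the sketch below. The class name TrailingText and the helper textAfter are hypothetical names made up for this illustration, and the sample HTML assumes the divs carry class="example" as in the selector above.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

public class TrailingText {

    // Hypothetical helper: collects the text that directly follows
    // each element matched by cssQuery.
    static List<String> textAfter(String html, String cssQuery) {
        Document doc = Jsoup.parse(html);
        List<String> result = new ArrayList<>();

        for (Element element : doc.select(cssQuery)) {
            Node next = element.nextSibling(); // null if there is no following node

            // instanceof is false for null, so this also skips a missing sibling
            if (next instanceof TextNode) {
                String text = ((TextNode) next).text().trim();
                if (!text.isEmpty()) {
                    result.add(text);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<div class=\"example\">Text #1</div> \"Another Text 1\"\n"
                    + "<div class=\"example\">Text #2</div> \"Another Text 2\" ";

        for (String text : textAfter(html, "div.example")) {
            System.out.println(text);
        }
    }
}
```

Trimming is optional; it just drops the leading and trailing spaces seen in the output above.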
