Reputation: 4649
I have the HTML content in a string, say "Html_content". I need to parse "Html_content" and pick out a few DIV tags by their IDs. For example, if "fullHeader" is the id of a DIV, I need to take the content inside that div tag and store it in a string.
I tried Jsoup, but with it I have to keep the collected div tags inside a Document, whereas I need to save them as a string. That does not seem to be possible with Jsoup; is there any alternative?
Upvotes: 3
Views: 581
Reputation: 577
Jsoup is exactly what you need. As I understand it, you need the HTML elements returned to you in String form so that you can use them to build another document.
Suppose you have an Element object, say ele, extracted from the HTML.
Now write
String htmlForEle = new Element(Tag.valueOf("div"), "").appendChild(ele.clone()).html(); // wraps a clone of ele in a temporary div and returns the div's inner HTML
htmlForEle is exactly what you are looking for.
Upvotes: 3
Reputation: 1109372
But I need to save it as string, but its not possible using Jsoup
Wrong, Jsoup has an Element#text() method for this.
String text = element.text(); // <div>foo<b>bar</b></div> will give "foobar"
// ...
Or, when you want to include the HTML in the string as well, use Element#html() or Element#outerHtml(), depending on the requirement.
String html = element.html(); // <div>foo<b>bar</b></div> will give "foo<b>bar</b>"
// ...
or
String html = element.outerHtml(); // <div>foo<b>bar</b></div> will give exactly this string
// ...
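Tying this back to the question, here is a minimal sketch of looking up a div by its id and returning its content as a String. It assumes jsoup is on the classpath; the class name DivExtractor and the helper innerHtmlById are made up for illustration, and "fullHeader" is the id from the question.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class DivExtractor {

    // Hypothetical helper: parses the HTML string and returns the inner HTML
    // of the element with the given id, or null if no such element exists.
    static String innerHtmlById(String htmlContent, String id) {
        Document doc = Jsoup.parse(htmlContent);
        Element div = doc.getElementById(id);
        return div == null ? null : div.html();
    }

    public static void main(String[] args) {
        String htmlContent =
            "<html><body><div id=\"fullHeader\">foo<b>bar</b></div></body></html>";
        // Prints the markup inside the div; jsoup may adjust whitespace.
        System.out.println(innerHtmlById(htmlContent, "fullHeader"));
    }
}
```

Swap div.html() for div.text() or div.outerHtml() if you want plain text or the div tag itself included.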
Upvotes: 3
Reputation: 689
If you force your HTML to XML syntax then you can use XPath, SAX, DOM, and other XML tools to manipulate the document.
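As a rough sketch of the XPath route, using only the JDK's built-in XML APIs: it assumes the input is already well-formed XML (ordinary HTML usually is not and will fail to parse), and the class name XhtmlDivText and the helper divTextById are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class XhtmlDivText {

    // Hypothetical helper: returns the text content of the div with the given id,
    // or null if there is no such div. Assumes the id contains no single quote.
    static String divTextById(String xhtml, String id) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node div = (Node) xpath.evaluate(
                    "//div[@id='" + id + "']", doc, XPathConstants.NODE);
            return div == null ? null : div.getTextContent();
        } catch (Exception e) {
            // Parsing fails if the input is not well-formed XML.
            throw new IllegalArgumentException("Could not parse input as XML", e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html><body><div id=\"fullHeader\">foo<b>bar</b></div></body></html>";
        System.out.println(divTextById(xhtml, "fullHeader")); // prints "foobar"
    }
}
```

This returns only the text; getting the nested markup back as a string would need an extra serialization step, for example with a javax.xml.transform.Transformer.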
Upvotes: 0