Reputation: 19905
Is there any way in JSoup to join two or more elements in memory, i.e., in the Document tree, without producing the raw HTML string?
For example, the following HTML div element with some nested tags
<div>This is text with <custom>a custom nested tag</custom> and some <other>text within a tag</other>, all of which should become part of the top-level </div>.
would be transformed into
<div>This is text with a custom nested tag and some text within a tag, all of which should become part of the top-level </div>.
Essentially, the nested tags in the example above have been deleted but their content has remained, as if a string replace() operation had been run on the raw HTML before it was parsed into a Document object by JSoup.
The overall operation could be coded like this:
public static void splice(Document document, List<String> tags) {
    for (String tag : tags) {
        // Find the tag node (Element) in the tree
        // Remove the tag node and join its content with its parent
    }
}
Upvotes: 0
Views: 134
Reputation: 2941
Jsoup's unwrap() method is what you're looking for. It removes the element from the tree but keeps its child nodes (text and nested elements), moving them up into the element's parent.
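Here is a minimal sketch of how the splice method from the question could be filled in with unwrap(). The class name Splice and the main method are only for illustration; it assumes jsoup is on the classpath and uses Elements.unwrap(), which applies unwrap() to every matched element.

```java
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class Splice {

    // For each tag name, find all matching elements and unwrap them:
    // the element itself is removed and its children are moved up
    // into its parent, in the same position.
    public static void splice(Document document, List<String> tags) {
        for (String tag : tags) {
            document.getElementsByTag(tag).unwrap();
        }
    }

    public static void main(String[] args) {
        // The example markup from the question.
        String html = "<div>This is text with <custom>a custom nested tag</custom>"
                + " and some <other>text within a tag</other>, all of which"
                + " should become part of the top-level </div>";

        Document document = Jsoup.parseBodyFragment(html);
        splice(document, List.of("custom", "other"));

        // Print the resulting body; the <custom> and <other> tags are gone.
        System.out.println(document.body().html());
    }
}
```

One caveat, as far as I can tell: unwrap() moves the children up but does not merge adjacent text nodes, so the div may end up holding several consecutive TextNode children rather than one. Calling text() or html() on the div still gives the joined result.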
Upvotes: 1