Joshua

Reputation: 125

How to remove tags with jsoup but keep given tags

How to remove all tags except <p> and <img> with jsoup?

<div>
  <p>hello world
    <span>good</span>
    <img src="/src/img/beauty.jpg"/>
    welcome
  </p>
</div>

It should become:

<p>hello world
  good
  <img src="/src/img/beauty.jpg"/>
  welcome
</p>

Upvotes: 1

Views: 328

Answers (1)

Michael Powers

Reputation: 2050

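You're going to want to look at jsoup's Cleaner, which is easiest to use through Jsoup.clean(). You pass it a Whitelist of the tags you want to allow; elements that are not on the list are dropped, but the text inside them is kept. Note that Whitelist.basic() allows <span> and does not allow <img>, so for your case you would start from Whitelist.none() and add only p and img (plus the src attribute on img).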

Example from jsoup.org:

String unsafe = 
    "<p><a href='http://example.com/' onclick='stealCookies()'>Link</a></p>";
String safe = Jsoup.clean(unsafe, Whitelist.basic());
    // now: <p><a href="http://example.com/" rel="nofollow">Link</a></p>
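Applied to the markup in the question, the same idea can be sketched as follows. This is a minimal sketch, not a tested drop-in: the class and method names (KeepOnlyPAndImg, keepOnlyPAndImg) are made up for illustration, and it assumes a recent jsoup release where the class is called Safelist; on the older releases this answer was written against, use org.jsoup.safety.Whitelist instead, with the same method calls.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

public class KeepOnlyPAndImg {

    // Hypothetical helper: strips every tag except <p> and <img>,
    // keeping the text of the removed elements.
    static String keepOnlyPAndImg(String html) {
        Safelist allowed = Safelist.none()       // start with no tags allowed
                .addTags("p", "img")             // allow only these elements
                .addAttributes("img", "src");    // attributes must be allowed explicitly
        return Jsoup.clean(html, allowed);
    }

    public static void main(String[] args) {
        String html = "<div><p>hello world <span>good</span> "
                + "<img src=\"/src/img/beauty.jpg\"/> welcome</p></div>";
        // The <div> and <span> wrappers are removed; "good" survives as plain text.
        System.out.println(keepOnlyPAndImg(html));
    }
}
```

No protocols are registered for src here, so the relative path is left untouched. If you add addProtocols("img", "src", ...), relative URLs get dropped unless you also pass a base URI and call preserveRelativeLinks(true). The output is re-serialized by jsoup, so whitespace and the self-closing slash on <img> may differ from the input.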

Upvotes: 1
