markvgti

Reputation: 4619

Preserving the <br> tags when cleaning with Jsoup

For the input text:

<p>Arbit string <b>of</b><br><br>text. <em>What</em> to <strong>do</strong> with it?

I run the following code:

Whitelist list = Whitelist.simpleText().addTags("br");
// Some other code...
// plaintext is the string shown above
retVal = Jsoup.clean(plaintext, StringUtils.EMPTY, list,
            new Document.OutputSettings().prettyPrint(false));

I get the output:

Arbit string <b>of</b>

text. <em>What</em> to <strong>do</strong> with it?

I don't want Jsoup to convert the <br> tags to line breaks; I want to keep them as-is. How can I do that?

Upvotes: 1

Views: 1169

Answers (2)

luksch

Reputation: 11712

I can't reproduce this. With Jsoup 1.8.3 and the following code:

String html = "<p>Arbit string <b>of</b><br><br>text. <em>What</em> to <strong>do</strong> with it?";
String cleaned = Jsoup.clean(html, 
        "", 
        Whitelist.simpleText().addTags("br"),
        new Document.OutputSettings().prettyPrint(false));
System.out.println(cleaned);

I get the following output:

Arbit string <b>of</b><br><br>text. <em>What</em> to <strong>do</strong> with it?

So I guess your problem lies somewhere else.
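One way to narrow it down is to dump the cleaned string with control characters made visible, so a literal <br> can be told apart from a real newline. This is only a rough sketch using plain Java: the class and method names (BrCheck, showControlChars, countBrTags) are made up for illustration and are not part of Jsoup, and the sample string merely stands in for whatever Jsoup.clean(...) returns in your code.

```java
class BrCheck {

    // Replace newline, carriage return and tab with visible escape sequences.
    static String showControlChars(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Count literal "<br>" occurrences (case-sensitive, tags without attributes only).
    static int countBrTags(String s) {
        int count = 0;
        int idx = s.indexOf("<br>");
        while (idx >= 0) {
            count++;
            idx = s.indexOf("<br>", idx + 4);
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical sample; replace with the actual result of Jsoup.clean(...).
        String cleaned = "Arbit string <b>of</b><br><br>text.";
        System.out.println(showControlChars(cleaned));
        System.out.println("literal <br> tags: " + countBrTags(cleaned));
    }
}
```

If the escaped dump still shows the <br> tags and no \n, the cleaner most likely kept them, and the line breaks are probably introduced later, for example by whatever renders or post-processes the string.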

Upvotes: 0

Try this:

Document doc2deal = Jsoup.parse(inputText);
doc2deal.select("br").append("br"); //or append("<br>")

Upvotes: 2
