Maduri

Reputation: 279

How to remove white space between paragraphs in Jsoup output?

Here is my code. When I print the output, blank lines appear between the paragraphs. How can I remove that white space? After that, I want to store the text sentence by sentence in an ArrayList.

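    import java.io.IOException;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    public class Main {
        public static void main(String[] args) {
            try {
                String url = "http://www.divaina.com/";

                // Route the request through an HTTP proxy
                System.setProperty("http.proxyHost", "cache.mrt.ac.lk");
                System.setProperty("http.proxyPort", "3128");

                Document doc = Jsoup.connect(url).timeout(10000).get();

                // Print the text of every <p> element
                Elements paragraphs = doc.select("p");
                for (Element p : paragraphs) {
                    System.out.println(p.text());
                }
            } catch (IOException ex) {
                ex.printStackTrace();
            }
        }
    }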

When I add the content directly to a database, the white space is added too. How can I remove the white space between paragraphs? What I actually want is to read the content of a web page and add it to the database line by line. Is there a better way to do this?

Screenshot of the output

Upvotes: 0

Views: 1003

Answers (2)

Yevgen

Reputation: 1667

Some of the paragraphs apparently contain no text, so they print as empty lines. Skipping them might help:

for (Element p : paragraphs) {
    String text = p.text().trim();
    if (!text.isEmpty()) {
        System.out.println(text);
    }
}
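Since the question also asks about storing the result in a list, here is a minimal sketch of the same filtering idea that collects the non-empty texts instead of printing them. It works on plain strings standing in for the `p.text()` values, so it runs without Jsoup. The class name `ParagraphFilter` and method name `nonBlank` are made up for illustration. It also treats a non-breaking space as blank, on the assumption that such characters may show up in scraped text (`String.trim()` does not remove them).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ParagraphFilter {

    // Returns only the entries that contain visible text, trimmed.
    // The input list stands in for the values returned by p.text().
    static List<String> nonBlank(List<String> paragraphTexts) {
        List<String> result = new ArrayList<>();
        for (String text : paragraphTexts) {
            // Assumption: non-breaking spaces (U+00A0) may be present;
            // convert them to ordinary spaces so trim() can remove them.
            String cleaned = text.replace('\u00A0', ' ').trim();
            if (!cleaned.isEmpty()) {
                result.add(cleaned);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> texts = Arrays.asList(
                "First paragraph.", "", "   ", "\u00A0", " Second paragraph. ");
        for (String line : nonBlank(texts)) {
            System.out.println(line);
        }
    }
}
```

With Jsoup, you could build the input list by adding `p.text()` for each element in the loop and then pass it to `nonBlank` before writing to the database.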

Upvotes: 1

Lawrance

Reputation: 1055

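Use a regular expression. Note that this removes every whitespace character, including the spaces between words:

String withoutSpace = whitespace.replaceAll("\\s", "");

Or, to remove only line breaks, try this: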

String withoutSpace = whitespace.replace("\n", "").replace("\r", "");
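If the goal is to keep the words separated and then store the text sentence by sentence, one possible variation is to collapse each run of whitespace into a single space and split the result with the JDK's `java.text.BreakIterator`. This is only a sketch: the class and method names are hypothetical, and `Locale.ENGLISH` is an assumption that may not suit the language of the page being scraped.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class SentenceSplitter {

    // Collapses every run of whitespace (spaces, tabs, line breaks)
    // into a single space and trims both ends.
    static String collapseWhitespace(String text) {
        return text.replaceAll("\\s+", " ").trim();
    }

    // Splits the cleaned text into sentences and returns them as a list.
    static List<String> sentences(String text) {
        List<String> result = new ArrayList<>();
        String cleaned = collapseWhitespace(text);
        // Assumption: English sentence rules; pick the locale that fits your content.
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(cleaned);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = cleaned.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                result.add(sentence);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String raw = "First sentence.\n\n  Second one here.\r\nThird?";
        System.out.println(collapseWhitespace(raw));
        for (String s : sentences(raw)) {
            System.out.println(s);
        }
    }
}
```

Compared with `replaceAll("\\s", "")`, the `\\s+` pattern with a single-space replacement keeps word boundaries intact, which sentence splitting depends on.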

Upvotes: 0
