General Disorder

Reputation: 23

JSoup Parse HTML for Webview

I need to display part of a web page in a WebView in my Android app: the section containing the PDFs. The page is https://www.limerick.ie/council/weekly-planning-lists and the part I want to show is this: https://i.sstatic.net/6HfsL.png. When I run my code, the WebView doesn't display anything and comes up blank.

Here is my code

package com.example.john_000.jsouptest;

import android.app.Activity;
import android.os.Bundle;
import android.webkit.WebView;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import java.io.IOException;

public class MainActivity extends Activity {
 public class HtmlParserActivity extends Activity {
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        WebView cardapio = (WebView) findViewById(R.id.webView);
        cardapio.getSettings().setJavaScriptEnabled(true);
        String data = "";
        Document doc = null;
        try {
            doc = Jsoup.connect("https://www.limerick.ie/council/weekly-planning-lists").get();
            Elements elements = doc.getElementsByClass("block-inner clearfix");
            for (Element element : elements) {
                data += element.outerHtml();
                data += "<br/>";
            }
            cardapio.loadData(data, "text/html", "UTF-8");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
 }
}

If anybody knows how to parse this HTML so that I show only the required table, your help would be greatly appreciated.

Upvotes: 0

Views: 2544

Answers (2)

Jonas Czech

Reputation: 12328

This isn't really specific to Android (I don't have my Android device handy), but it works in plain Java:

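String url = "https://www.limerick.ie/council/weekly-planning-lists";

// Fetch and parse the page, then take the first table with class "sticky-enabled".
Document document = Jsoup.connect(url).get();
Element table = document.select("table.sticky-enabled").first();

// outerHtml() returns the element's markup, including the <table> tag itself.
String text = table.outerHtml();
System.out.println(text);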

And it produces the following output:

<table class="sticky-enabled"> 
 <thead>
  <tr>
   <th>Attachment</th>
   <th>Size</th> 
  </tr>
 </thead> 
 <tbody> 
  <tr class="odd">
   <td><span class="file"><img class="file-icon" alt="PDF icon" title="application/pdf" src="/modules/file/icons/application-pdf.png"> <a href="https://www.limerick.ie/sites/default/files/260216_applications_refused.pdf" type="application/pdf; length=6526" title="260216_applications_refused.pdf">26/02/16 Applications Refused</a></span></td>
   <td>6.37 KB</td> 
  </tr> 
  <tr class="even">
   <td><span class="file"><img class="file-icon" alt="PDF icon" title="application/pdf" src="/modules/file/icons/application-pdf.png"> <a href="https://www.limerick.ie/sites/default/files/260216_applications_granted.pdf" type="application/pdf; length=20585" title="260216_applications_granted.pdf">26/02/16 Applications Granted</a></span></td>
   <td>20.1 KB</td> 
[...]

So in your code, you can replace

Elements elements = doc.getElementsByClass("block-inner clearfix");
for (Element element : elements) {
    data += element.outerHtml();
    data += "<br/>";
}

with

data = doc.select("table.sticky-enabled").first().outerHtml();

Your data String will then contain the complete HTML of the table, which you can load into the WebView as before. Note that raw HTML loaded into a WebView like this will have no formatting or styling, since the page's stylesheets (CSS) are not loaded.
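
One way to get some styling back is to wrap the table fragment in a minimal page of your own before handing it to the WebView. The sketch below is plain Java; the class name TablePage, the wrap method, and the inline CSS are illustrative assumptions, not part of jsoup or Android. The <base> tag is there so that relative URLs inside the fragment (such as the /modules/file/icons/... image in the output above) can resolve against the site.

```java
// Hypothetical helper: wraps an HTML fragment in a small standalone page.
final class TablePage {

    private TablePage() {}

    // tableHtml: the fragment produced by outerHtml().
    // baseUrl:   assumed to be a trusted URL; it is inserted without escaping.
    static String wrap(String tableHtml, String baseUrl) {
        return "<!DOCTYPE html><html><head>"
                + "<meta charset=\"UTF-8\">"
                // Lets relative src/href values in the fragment resolve.
                + "<base href=\"" + baseUrl + "\">"
                // Minimal styling, since the site's own CSS is not loaded.
                + "<style>"
                + "table{border-collapse:collapse;width:100%}"
                + "th,td{border:1px solid #ccc;padding:6px;text-align:left}"
                + "</style>"
                + "</head><body>" + tableHtml + "</body></html>";
    }
}
```

On Android, the resulting string could then be passed to something like cardapio.loadDataWithBaseURL("https://www.limerick.ie/", html, "text/html", "UTF-8", null) instead of loadData. I haven't tried this on a device, so treat it as a starting point.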

If it doesn't work:

  • Make sure your WebView is visible in your layout.

  • Make sure you've added the INTERNET permission (android.permission.INTERNET) to your AndroidManifest.xml.

  • Look at Logcat (see here) and check whether there are any exceptions, especially NetworkOnMainThreadException (which you're probably getting, since the request runs in onCreate; see here).
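
If NetworkOnMainThreadException does turn out to be the problem, the fetch has to move off the main thread. Here is a minimal sketch of that pattern in plain Java. BackgroundFetch and its run method are hypothetical names, and a single-thread executor is only one of several ways to do this. On Android, the result callback would also need to hop back to the UI thread (for example via Activity.runOnUiThread) before touching the WebView.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical helper: runs a blocking task on a worker thread and
// reports either its result or the exception it threw.
final class BackgroundFetch {

    // One daemon worker thread, so it never keeps the process alive.
    private static final ExecutorService EXECUTOR =
            Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r, "fetch-worker");
                t.setDaemon(true);
                return t;
            });

    private BackgroundFetch() {}

    // Both callbacks are invoked on the worker thread, not on the caller's.
    static Future<?> run(Callable<String> task,
                         Consumer<String> onResult,
                         Consumer<Exception> onError) {
        return EXECUTOR.submit(() -> {
            try {
                onResult.accept(task.call());
            } catch (Exception e) {
                onError.accept(e);
            }
        });
    }
}

// Assumed Android usage inside the Activity (untested here):
//
// BackgroundFetch.run(
//         () -> Jsoup.connect(url).get()
//                    .select("table.sticky-enabled").first().outerHtml(),
//         html -> runOnUiThread(
//                 () -> cardapio.loadData(html, "text/html", "UTF-8")),
//         Throwable::printStackTrace);
```

Using a Callable rather than a Runnable lets the task throw the IOException from Jsoup.connect(...).get() directly, so the error handling lives in one place.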

Let me know if it works, and if it doesn't, I'll try on an Android device and see.

Upvotes: 0

Navid Shakibapour

Reputation: 555

Replace your try-catch block with this one:

try {
    doc = Jsoup.connect("https://www.limerick.ie/council/weekly-planning-lists").get();
    Elements elements = doc.select("div.block-inner.clearfix");
    for (Element element : elements) {
        if (!element.select("tbody").isEmpty()) {
            data = element.outerHtml() + "<br/>";
            break;
        }
    }
    cardapio.loadData(data, "text/html", "UTF-8");
} catch (IOException e) {
    e.printStackTrace();
}

Upvotes: 1
