Parsing HTML with jsoup didn't retrieve expected results

Question

I'm writing a small parser to get data about diseases from the CDC website. I'm using jsoup, and everything seems to work except for the following.

I have four example URLs. For each one I have already parsed the page to obtain the internal link to the section that contains the data I want (see the code).

If you look at the source of each page, you can check that these links exist.

After obtaining the internal link, I try to retrieve the Element it points to. This works for only two of the four pages, and I don't know why.

Here is my code:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class MainJSoupTest {

    public MainJSoupTest() {
        try {
            test("http://www.cdc.gov/HAI/organisms/bCepacia.html", "#a3");
            test("http://www.cdc.gov/meningitis/bacterial.html", "#symptoms");
            test("http://www.cdc.gov/nczved/divisions/dfbmd/diseases/botulism/", "#symptoms");
            test("http://www.cdc.gov/getsmart/antibiotic-use/URI/bronchitis.html", "c");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Fetches the page and prints how many elements match the given selector.
    private void test(String url, String element) throws Exception {
        Document doc = Jsoup.connect(url).get();
        Elements els = doc.select(element);
        System.out.println(" ---- Test -----");
        System.out.println("URL: " + url);
        System.out.println("Element: " + element);
        System.out.println("Size: " + els.size());
    }

    public static void main(String[] args) {
        new MainJSoupTest();
    }

}

And the output:

 ---- Test -----
URL: http://www.cdc.gov/HAI/organisms/bCepacia.html
Element: #a3
Size: 1
 ---- Test -----
URL: http://www.cdc.gov/meningitis/bacterial.html
Element: #symptoms
Size: 0
 ---- Test -----
URL: http://www.cdc.gov/nczved/divisions/dfbmd/diseases/botulism/
Element: #symptoms
Size: 1
 ---- Test -----
URL: http://www.cdc.gov/getsmart/antibiotic-use/URI/bronchitis.html
Element: c
Size: 0

As you can see, the size for two of the pages is 1 (as expected, since there is an element that the internal link points to). However, the other two return 0.

Any thoughts?
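
One possible explanation, offered as an assumption and not verified against these pages: an internal link such as #symptoms can point either to an element with id="symptoms" or to an older-style named anchor, <a name="symptoms">, and the selector #symptoms only matches the first form. The fourth call also passes c without a leading #, which jsoup reads as a tag name and not as an id. The sketch below is a hypothetical helper (FragmentSelector and forFragment are names invented here for illustration) that turns a fragment into a selector covering both forms.

```java
public class FragmentSelector {

    /**
     * Builds a CSS selector for the target of an internal link.
     * Accepts "#symptoms" or "symptoms" and matches either an element
     * with id="symptoms" or a named anchor <a name="symptoms">.
     * Assumption: the fragment contains no double quotes.
     */
    public static String forFragment(String fragment) {
        // Drop the leading '#' if the caller passed the link as written in href.
        String name = fragment.startsWith("#") ? fragment.substring(1) : fragment;
        if (name.isEmpty() || name.contains("\"")) {
            throw new IllegalArgumentException("Unsupported fragment: " + fragment);
        }
        // The comma means "either of these selectors".
        return "[id=\"" + name + "\"], a[name=\"" + name + "\"]";
    }

    public static void main(String[] args) {
        System.out.println(forFragment("#symptoms")); // [id="symptoms"], a[name="symptoms"]
        System.out.println(forFragment("c"));         // [id="c"], a[name="c"]
    }
}
```

In the test method above, doc.select(FragmentSelector.forFragment(element)) could then replace doc.select(element). If the size is still 0 for a page, the anchor is probably not present in the HTML that jsoup receives, and printing doc.html() would be the next thing to check.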
