Gil Peretz

Reputation: 2419

Jsoup returns inconsistent results

I'm trying to use Jsoup to scrape the following URL:

http://translink.com.au//travel-information/service-notices/25611/details

I used the selector query #content-left-column > div.content, but the results are inconsistent.

Sometimes I get no results, and sometimes I get the expected results.

import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class JsoupSelectorMain {

    // Runs the CSS selector query against the parsed document.
    public static Elements getAlertsElements(Document document, String query) {
        return document.select(query);
    }

    public static void main(String[] args) {
        Document doc;
        try {
            doc = Jsoup.connect("http://translink.com.au//travel-information/service-notices/25611/details").get();
        } catch (IOException e) {
            e.printStackTrace();
            return; // nothing to select from if the fetch failed
        }
        String query = "#content-left-column > div.content";
        Elements elements = getAlertsElements(doc, query);

        for (int i = 0; i < elements.size(); i++) {
            System.out.println(elements.get(i).toString());
            System.out.println();
        }

        System.out.println("size=" + elements.size());
    }
}

I tried timeout(0), but that did not help, so the timeout is not the issue. I also checked Jsoup's known issues but couldn't find a similar case.

What am I missing here?

Upvotes: 1

Views: 228

Answers (1)

Tanmay

Reputation: 3159

I think it's because the site detects the request as coming from a mobile user agent, and that is probably what causes the inconsistency in your results. I created a new project in Eclipse, and in debug mode I found that the URL had been changed to http://mobile.translink.com.au//travel-information/service-notices/25611/details

Screenshot: the debugger view without setting .userAgent

And then I changed this statement:

doc = Jsoup.connect("http://translink.com.au//travel-information/service-notices/25611/details").timeout(0).get();

To this:

doc = Jsoup.connect("http://translink.com.au//travel-information/service-notices/25611/details").timeout(0).userAgent("Chrome").get();

...so that the site detects the request as coming from a non-mobile (desktop) user agent.

Screenshot: the debugger view after adding the user agent
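
If you want to confirm the redirect without stepping through a debugger, you can compare the host you requested with the host the page was finally served from. The following is only a minimal sketch: the class name RedirectCheck and the method wasRedirectedToOtherHost are made up for illustration, and it uses only java.net.URI, so it runs without Jsoup on the classpath.

```java
import java.net.URI;

public class RedirectCheck {

    // Hypothetical helper (not part of Jsoup): returns true when the final URL
    // is served from a different host than the one that was requested,
    // e.g. translink.com.au -> mobile.translink.com.au.
    public static boolean wasRedirectedToOtherHost(String requestedUrl, String finalUrl) {
        String requestedHost = URI.create(requestedUrl).getHost();
        String finalHost = URI.create(finalUrl).getHost();
        return requestedHost != null && !requestedHost.equalsIgnoreCase(finalHost);
    }

    public static void main(String[] args) {
        String requested = "http://translink.com.au//travel-information/service-notices/25611/details";
        // The URL observed in the debugger when no user agent was set.
        String landed = "http://mobile.translink.com.au//travel-information/service-notices/25611/details";
        System.out.println("redirected to another host: " + wasRedirectedToOtherHost(requested, landed));
    }
}
```

In the scraper itself, you could pass doc.location() as the second argument, since Jsoup's Document.location() reports the URL the document was actually served from after redirects. If the check returns true, the selector is probably running against the mobile page, which may not contain #content-left-column at all.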

Upvotes: 1
