Reputation: 1248
I'm doing some HTML parsing using Jsoup in Android, and I ran into something that looks weird to me. Some sites work fine with the simple Jsoup.connect(String).get() method.
But on some other sites I ALWAYS get an EOFException. So I searched and came across the userAgent property, and when I use Jsoup.connect(String).userAgent("Mozilla").get() it works just fine.
Now, what exactly does that "Mozilla" mean? Does it mean my app will work only on devices that have Mozilla installed?
Upvotes: 0
Views: 1501
Reputation: 16508
Every time your web browser opens a web page, it sends a "request" for that page. Part of that request is a series of "headers". Suppose you are using Firefox to open Google; then something like this is sent to Google:
Host: www.google.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: de,en-US;q=0.7,en;q=0.3
Accept-Encoding: gzip, deflate
....
The HTML response you get back is often tailored to the client identified by the User-Agent header: a desktop browser, a crawler, and so on. It can be completely different from the HTML sent to, for example, a mobile agent (Android), and the mobile version may not contain anything that matches your Jsoup selectors. So with userAgent("Mozilla").get() you are saying that your application wants the same response you see when you open google.com in a desktop browser. The value is only a string sent in the request header; nothing called Mozilla has to be installed on the device.
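To make the point concrete, here is a minimal sketch showing that the user agent is nothing more than a string attached to the outgoing request. It deliberately uses the standard HttpURLConnection instead of Jsoup so that it has no third-party dependency; the class name UserAgentSketch and the helper prepare are made up for illustration. The request is only prepared, not sent.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class UserAgentSketch {
    // Desktop Firefox string copied from the header listing in this answer.
    static final String DESKTOP_UA =
        "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0";

    // Hypothetical helper: prepares (but does not send) a GET request
    // that carries the given User-Agent header value.
    static HttpURLConnection prepare(String url, String userAgent) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        // The header is plain text; no browser has to exist on the device.
        conn.setRequestProperty("User-Agent", userAgent);
        return conn;
    }

    public static void main(String[] args) throws IOException {
        HttpURLConnection conn = prepare("https://www.google.com/", DESKTOP_UA);
        // Prints the header value that would be sent with the request.
        System.out.println(conn.getRequestProperty("User-Agent"));
    }
}
```

With Jsoup, the userAgent(...) call from the question plays the same role as setRequestProperty here, and passing a full browser string such as the one above instead of just "Mozilla" should bring the response closer to what a real desktop browser receives.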
Upvotes: 3