user1508972

Reputation: 21

Jsoup returns HTML different from what a web browser shows

I have a url = "http://mp3.zing.vn/tim-kiem/playlist.html?q=Bảo Thy"

Document doc = Jsoup.connect(url).get();

When I use Jsoup to fetch the HTML, it doesn't work correctly: the HTML it returns is different from what I get in a browser. How can I solve this problem?

However, when I use the URL without parameters (http://mp3.zing.vn), it works correctly.

Upvotes: 0

Views: 1083

Answers (3)

Saeid Farivar

Reputation: 1687

I had the same issue and fixed it by setting a user agent:

Document doc = Jsoup.connect("YourURL").userAgent("Mozilla").get();

Upvotes: 1

user1508972

Reputation: 21

I have solved this problem.

http://mp3.zing.vn/tim-kiem/playlist.html?q=Bảo thy

The query parameter is a Vietnamese phrase, and the site expects it to be URL-encoded. So I have to URL-encode every parameter value as UTF-8:

keyword = URLEncoder.encode(keyword,"UTF-8");

and the URL after encoding is

http://mp3.zing.vn/tim-kiem/playlist.html?q=B%E1%BA%A3o%20thy

With the encoded URL, Jsoup works correctly.

Thanks, everyone. Topic closed.
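
For anyone who wants to reproduce the encoding step, here is a minimal sketch using only the JDK. The class name `QueryEncoder` and the method `buildSearchUrl` are hypothetical names chosen for illustration; only the base URL and the `q` parameter come from the question. One detail worth knowing: `URLEncoder.encode` produces form encoding, so a space comes out as `+` rather than `%20`. Most servers accept either in a query string, but the sketch replaces `+` with `%20` so the result matches the URL shown above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class QueryEncoder {
    // Base URL taken from the question; only the q parameter value is encoded.
    static final String BASE = "http://mp3.zing.vn/tim-kiem/playlist.html?q=";

    // Hypothetical helper: builds the search URL for a raw (unencoded) keyword.
    static String buildSearchUrl(String keyword) {
        try {
            // Form-encodes the keyword as UTF-8 bytes; spaces become '+'.
            String encoded = URLEncoder.encode(keyword, "UTF-8");
            // A literal '+' in the input is already encoded as %2B at this point,
            // so every remaining '+' stands for a space and can become %20.
            return BASE + encoded.replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "B\u1EA3o thy" is "Bảo thy", written with an escape to avoid
        // depending on the source file encoding.
        System.out.println(buildSearchUrl("B\u1EA3o thy"));
    }
}
```

The resulting string can then be passed to `Jsoup.connect(...)` in place of the raw URL. Only the parameter value should be encoded, not the whole URL, since encoding the whole URL would also escape `:`, `/`, `?` and `=`.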

Upvotes: 2

Martin Revert

Reputation: 3292

It is quite possible that you will need to provide a cookie, a session, or some kind of registration method.

Please check this:

Advice with crawling web site content

Upvotes: 1
