hnm

Reputation: 829

Help needed on web spider

I am writing a very basic web spider in Java. I am facing one problem: the content loaded for the same URL is different from what the browser shows. For example, try the URL below.

http://www.google.co.in/search?sourceid=chrome&ie=UTF-8&q=web+spider#sclient=psy&hl=en&source=hp&q=web+spider&aq=f&aqi=&aql=&oq=web+spider&pbx=1&fp=d8e8e41d6d2bda33&biw=1366&bih=643

If you load this URL in a browser and through Java's URL class, the contents are different. This may be because of the following reasons.

So is there a way to simulate a browser in my Java program? Are there any third-party libraries that load the page the way a browser does and then return the final content? Any help is appreciated.

Upvotes: 3

Views: 342

Answers (1)

Frederic Bazin

Reputation: 1529

Try HtmlUnit. It can emulate browser behaviour and handle JavaScript.
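One likely reason JavaScript matters for the URL in the question is that everything after the `#` is a fragment. An HTTP client sends only the path and query to the server, and the fragment stays on the client, where a browser's scripts can read it. The sketch below uses only the standard `java.net.URI` class to split a shortened version of that URL. The class name `FragmentDemo` and its two helper methods are made up for illustration, and this is an assumption about the cause, not a confirmed diagnosis.

```java
import java.net.URI;

// Hypothetical helper class, for illustration only.
class FragmentDemo {

    // Path plus query: the part of the URL that goes into the HTTP request line.
    static String requestTarget(String url) {
        URI uri = URI.create(url);
        String path = uri.getRawPath();
        String query = uri.getRawQuery();
        return query == null ? path : path + "?" + query;
    }

    // Text after '#': never sent to the server; null if the URL has no fragment.
    static String clientOnlyFragment(String url) {
        return URI.create(url).getRawFragment();
    }

    public static void main(String[] args) {
        // Shortened version of the URL from the question.
        String url = "http://www.google.co.in/search?sourceid=chrome&ie=UTF-8&q=web+spider"
                + "#sclient=psy&hl=en&q=web+spider";
        System.out.println("sent to server: " + requestTarget(url));
        System.out.println("kept in client: " + clientOnlyFragment(url));
    }
}
```

If this is what is happening, a plain fetch sees only the page for the query part, while a browser (or a JavaScript-capable emulator such as HtmlUnit) can go on to act on the fragment.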

Upvotes: 1
