Craig

Reputation: 47

Web scraping through web pages using JSoup

I've made a web scraper to scrape pieces of information from IMDb. It traverses each page by changing the number in the URL to a different random one and then repeats the scraping process on the new page.

http://www.imdb.com/title/tt0800369/ <-- changing this number gives a new movie.

How can I do this on the BFI website? I can't see a way to go from film to film.

Thanks in advance!

Upvotes: 0

Views: 727

Answers (1)

Damian

Reputation: 3050

Following randomly generated links is not the most efficient way to traverse the web. You really should follow URLs that you find on other pages. You can use crawler4j, which seems to be the easiest Java crawler to start with. There are also some alternatives.
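
To illustrate what "follow URLs that you find on other pages" means, here is a minimal sketch of a breadth-first crawl loop using only the Java standard library. It is not how crawler4j works internally. The class name `LinkFrontierCrawler`, the `extractLinks` hook and the `maxPages` limit are all names I made up for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper class, for illustration only.
class LinkFrontierCrawler {

    /**
     * Visits the seed page first, then the pages it links to, and so on,
     * never visiting the same URL twice and stopping after maxPages pages.
     *
     * extractLinks is an assumed hook: given a URL, it returns the URLs
     * found on that page (for example, by parsing the page's anchors).
     */
    static List<String> crawl(String seed,
                              Function<String, List<String>> extractLinks,
                              int maxPages) {
        Deque<String> frontier = new ArrayDeque<>(); // URLs waiting to be visited
        Set<String> seen = new HashSet<>();          // URLs already queued or visited
        List<String> visited = new ArrayList<>();    // visit order, returned to the caller

        frontier.add(seed);
        seen.add(seed);

        while (!frontier.isEmpty() && visited.size() < maxPages) {
            String url = frontier.poll();
            visited.add(url);

            // Queue every link we have not seen before.
            for (String link : extractLinks.apply(url)) {
                if (seen.add(link)) {
                    frontier.add(link);
                }
            }
        }
        return visited;
    }
}
```

With JSoup, the `extractLinks` hook could fetch the page with `Jsoup.connect(url).get()`, select `a[href]` elements, and collect `absUrl("href")` for each one. For a site like the BFI you would probably also filter the links so that only film pages are queued. I have not checked what those URLs look like, so that filter is left out here.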

Upvotes: 1
