Reputation: 53
Hi, I am finishing a little hobby project of mine: a small-scale search engine.
Does anyone know of a decent, robust open-source web crawler that they have actually used? It should be easy for a beginner to set up and use.
Please don't just google "web crawlers" and paste a list. Thank you.
Upvotes: 4
Views: 360
Reputation: 309
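I think you should read about a similar experience: the original Google paper by Brin and Page, "The Anatomy of a Large-Scale Hypertextual Web Search Engine".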
http://infolab.stanford.edu/~backrub/google.html
Upvotes: 0
Reputation: 54144
crawler4j is a pretty decent crawler, multi-threaded and easy to configure and use. It's written in Java.
You can find a list of open-source crawlers on this Wikipedia page.
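If it helps to see what any crawler does at its core before picking a library, here is a rough sketch of the basic loop: a frontier queue, a set of seen URLs, and link extraction. This is not crawler4j's API. The class `TinyCrawler`, its methods, and the in-memory "web" map are all made up for illustration, and the regex link extraction is a deliberate simplification (a real crawler would fetch over HTTP, use a proper HTML parser, and respect robots.txt).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: a breadth-first crawl over an in-memory "web".
public class TinyCrawler {
    // Naive pattern for double-quoted href attributes (assumption: well-formed HTML).
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]+)\"");

    // Returns every href target found in the given HTML, in document order.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    // Crawls breadth-first from the seed; 'web' maps URL -> HTML and stands in for HTTP fetches.
    static List<String> crawl(String seed, Map<String, String> web, int maxPages) {
        Deque<String> frontier = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        List<String> visited = new ArrayList<>();
        frontier.add(seed);
        seen.add(seed);
        while (!frontier.isEmpty() && visited.size() < maxPages) {
            String url = frontier.poll();
            String html = web.get(url);
            if (html == null) {
                continue; // page could not be "fetched"; skip it
            }
            visited.add(url);
            for (String link : extractLinks(html)) {
                if (seen.add(link)) { // add() returns false for URLs already queued
                    frontier.add(link);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        Map<String, String> web = Map.of(
            "http://a.example/", "<a href=\"http://b.example/\">b</a> <a href=\"http://c.example/\">c</a>",
            "http://b.example/", "<a href=\"http://a.example/\">a</a>",
            "http://c.example/", "<a href=\"http://d.example/\">dead link</a>");
        System.out.println(crawl("http://a.example/", web, 10));
    }
}
```

The seen set is what stops the a/b link cycle from looping forever, and the `maxPages` cap keeps a hobby crawl bounded. A library like crawler4j handles the same concerns for you, plus threading and fetching.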
Upvotes: 2