Reputation: 10934
From your experience, how difficult would it be to programmatically search for a term on the Yellow Pages website and then scrape the contact information from the results into a CSV file?
Upvotes: 1
Views: 128
Reputation: 33
With Perl and a module like WWW::Robot, it probably won't be that hard. I haven't tried it myself, but since you know Python, Scrapy might help: http://scrapy.org
Remember not to hammer the site when you crawl, because your IP can get banned. Leave a pause between requests.
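To make "not hammering" concrete, here is a minimal sketch of a throttle that enforces a minimum gap between requests. The `Throttle` class and the interval value are my own illustration, not part of Scrapy or any other library. (Scrapy has its own `DOWNLOAD_DELAY` setting if you go that route.)

```python
import time


class Throttle:
    """Enforce a minimum delay between successive requests (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests (an assumption; tune it)
        self._clock = clock               # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None                 # time of the previous request, if any

    def wait(self):
        """Sleep just long enough to respect min_interval, then record the time."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Call `throttle.wait()` right before each page fetch; the first call returns immediately and later calls pause only as long as needed.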
Upvotes: 1
Reputation: 53881
With the right modules and libraries, it's very doable! It depends on your tools, though: with Perl or Python you'll be all set. If you're trying to do this in C++, you may have a bit more pain heading your way.
If you provide more information about your situation (language, frameworks, constraints), I can be more specific.
Also, there are legal issues to consider with scraping. I'm not sure what the Yellow Pages policy on bots is, so read their robots.txt before proceeding. http://www.robotstxt.org/ is a good starting point for learning how robots.txt works.
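If you go with Python, the standard library can do the robots.txt check for you. This is only a sketch: the `is_allowed` helper, the bot name, and the sample rules are made up for illustration and are not the real Yellow Pages rules.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules for a placeholder site, NOT the actual Yellow Pages robots.txt.
SAMPLE_RULES = """\
User-agent: *
Disallow: /search
"""

if __name__ == "__main__":
    print(is_allowed(SAMPLE_RULES, "my-bot", "http://example.com/search/plumbers"))
    print(is_allowed(SAMPLE_RULES, "my-bot", "http://example.com/about"))
```

Against a live site you would instead call `parser.set_url(...)` with the site's robots.txt URL and then `parser.read()` before `can_fetch`. Keep in mind that robots.txt only covers crawler etiquette; the site's terms of service are a separate question.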
The best way to be both safe and legal is to just use the API: http://developer.yp.com/
Upvotes: 0
Reputation: 405745
Can you just use the YP Search API? Access is free, and it only takes a minute to set up a developer account.
Upvotes: 2