Reputation: 17013
I'd like to make a tool which accesses a search engine programatically.
I've been enjoying using YQL recently and thought it might be useful since it can dig data out of HTML pages.
But I tried it with Google, Bing, and Yahoo search and they all seem to block YQL.
I wonder if there are some lesser-known web search sites that might work with YQL.
Or actually if there's still any search engine which offers an API that would be even better.
(In fact I'm only searching linguistics.stackexchange.com because the Stack Exchange APIs don't provide a way to search by text that I can find.)
Upvotes: 0
Views: 244
Reputation: 10721
Most search engine sites will block access from screen scrapers and other agents. YQL is designed to respect the robots.txt
file, so on many sites like this it won't work.
Instead, I suggest moving a step above HTML screen scraping and using a published search API.
In YQL for example, there is a table which provides access to the Bing search results:
select * from microsoft.bing where query="soccer" and source in ("web","image")
You could also look at the Yahoo! BOSS API or using the Bing Search API directly.
Upvotes: 1