Robert Green MBA

Reputation: 1866

Trouble with Google Custom Search API

Need: Search Google via the API and get a JSON result that mimics the results I get when I search on the web.

My Custom Search engine is set to search the entire web.

My API search returns (see results): Search Term: 072745546181 (which is a UPC label for some Chicken Breasts) https://www.googleapis.com/customsearch/v1?key=AIzaSyBaPxycT3gj82T5qm66XGgIvtSEP31LISo&cx=015261035819156121642:qj7jmhlymjw&q=072745546181

Web search returns (see results) Search Term: 072745546181 (which is a UPC label for some Chicken Breasts) Example 1: https://www.google.de/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=072745546181

Notice that the q= parameter at the end is the same in both: q=072745546181

There must be a simple answer. What am I doing wrong here?

Conversely, searching with terms like "Donald Trump President" https://www.googleapis.com/customsearch/v1?key=AIzaSyBaPxycT3gj82T5qm66XGgIvtSEP31LISo&cx=015261035819156121642:qj7jmhlymjw&q=donald trump president

returns a usable result that I can work with. No problem there, but why does it fail when searching for UPCs?

What should I do?

Update 1.26.17 - Added 50 Point Bounty, I can make more. What is normal rate? Need some help!

Upvotes: 4

Views: 2283

Answers (2)

Rogala

Reputation: 2773

This is an old one, but still relevant. You have to create a custom search engine to look for UPCs here: https://cse.google.com/all

Once you do that, you need to add the sites to search (e.g. https://www.barcodespider.com, https://www.upcitemdb.com).


From there, your cURL request looks like this:

curl -X GET \
  'https://www.googleapis.com/customsearch/v1?key={{googleApiKey}}&cx={{googleUpcSearchEngineCode}}&q=034449787178' \
  -H 'Accept: */*' \
  -H 'Cache-Control: no-cache'

The request will filter the results by searching for the UPC within the sites specified.
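For anyone calling this from code rather than cURL, here is a minimal Python sketch of the same request. The function names `build_search_url` and `extract_links` are hypothetical helpers, not part of any Google client library, and the key and engine ID are placeholders. It only builds the URL and reads the `items` list from an already-decoded JSON response; as far as I know, the API omits the `items` key entirely when nothing matches, which is why the helper falls back to an empty list.

```python
from urllib.parse import urlencode

# Endpoint taken from the cURL request above.
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key, cx, query):
    """Build the Custom Search request URL with properly encoded parameters.

    api_key and cx are placeholders for your own API key and search engine ID.
    """
    params = {"key": api_key, "cx": cx, "q": query}
    return API_ENDPOINT + "?" + urlencode(params)


def extract_links(response):
    """Return the result links from a decoded JSON response (a dict).

    A response with no matches is assumed to have no "items" key at all,
    so an empty list is returned in that case.
    """
    return [item["link"] for item in response.get("items", [])]


url = build_search_url("YOUR_API_KEY", "YOUR_ENGINE_ID", "034449787178")
print(url)
```

To actually send the request you could pass `url` to `urllib.request.urlopen` and decode the body with `json.load`, then hand the resulting dict to `extract_links`.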

Upvotes: 0

SpliFF

Reputation: 38956

This is a fascinating question. I just ran a series of tests that confirm that keywords are treated oddly if they contain more than 8 numerical digits (even if separated by whitespace or hyphens). They are not simply ignored, because this SO page is found, but most websites are not returned. My best guess for this behaviour is that it is a deliberate filter put in by Google to restrict numerical searches to "trusted" websites in order to prevent phone number lookups. It might even be a more aggressive move to limit UPC, government record and patent lookups so automated tools can't compete with current or planned Google services that do the same.

I experimented with all sorts of tests including advanced operators like inurl%3A072745546181, allintitle%3A0727+4554+6181 and targeting sites that appear in the regular search like url%3Abuycott.com+072745546181 and the behaviour is consistent. It is so consistent that it has to be deliberate.
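In case the percent-encoded operators above look cryptic, here is a small Python sketch showing how those strings come about. It is only an illustration of the encoding (`:` becomes `%3A` and a space becomes `+`), not of anything specific to Custom Search; the `encode_query` helper name is made up for this example.

```python
from urllib.parse import quote_plus


def encode_query(raw_query):
    """Percent-encode a raw search query for use as the q= parameter."""
    return quote_plus(raw_query)


# The raw operator queries and their encoded forms as used in the tests above.
for raw in ["inurl:072745546181", "allintitle:0727 4554 6181"]:
    print(raw, "->", encode_query(raw))
```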

I'd say with 95% certainty you can't do what you want with Custom Search and it's highly unlikely Google will provide you a workaround.

I would suggest trying another search API provider, such as the Bing Web Search API, Faroo, or one of these product search APIs.
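If you do end up mixing providers, a rough pre-check like the following could decide which queries to route away from Custom Search. This is only a sketch of the pattern I observed (more than 8 digits, even when separated by whitespace or hyphens). The threshold is my own empirical guess, not documented Google behaviour, and `looks_like_long_number` is a hypothetical helper name.

```python
import re

# Assumption: the threshold of 8 comes from my own tests, not from any documentation.
MAX_DIGITS = 8

# A run of digits, allowing a single whitespace character or hyphen between digits.
_DIGIT_RUN = re.compile(r"\d(?:[\s-]?\d)*")


def looks_like_long_number(query):
    """Return True if the query contains a digit run with more than MAX_DIGITS digits."""
    for match in _DIGIT_RUN.finditer(query):
        digit_count = sum(ch.isdigit() for ch in match.group())
        if digit_count > MAX_DIGITS:
            return True
    return False


for q in ["072745546181", "0727-4554-6181", "donald trump president"]:
    print(q, looks_like_long_number(q))
```

Queries flagged by this check would go to the alternative provider, while everything else can still use Custom Search.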

Upvotes: 1
