Reputation: 5
Is it still possible to extract metadata like price from a site using ImportXML in Google Sheets?
I have tried multiple different variations on the following page without success: https://www.officedepot.com/a/products/273646/Office-Depot-White-Copy-Paper-Letter/
=IMPORTXML("https://www.officedepot.com/a/products/273646/Office-Depot-White-Copy-Paper-Letter/","//*[contains(@itemprop,'price')]/@content")
=IMPORTXML("https://www.officedepot.com/a/products/273646/Office-Depot-White-Copy-Paper-Letter/","//meta[@itemprop='price']/@content")
I should be able to use this formula to return "58.99", but I continuously receive an NA error.
Upvotes: 0
Views: 720
Reputation: 5905
OfficeDepot seems to block the requests from GoogleSheets. Some clues :
Get the price directly from the .json (we can't use ImportJSON
script to load this directly in Sheets since requests are blocked). Change the product id in the url accordingly :
Another option could be to use SerpAPI(commercial)+ImportJSON to fetch the product price from GoogleShooping.
Or you can use the GoogleSearch API(free)+ImportJson
. Output :
You need an API key and an ImportJSON script (credits to Brad Jasper) to do this. Once you have installed the script and activated your API key, add a search engine. In the settings, you have to define the target website.
Copy somewhere your search engine id (cx=XXXXXXXXXX). Once this is done and assuming you have the urls in column A, you can paste in cell B2 :
=REGEXEXTRACT(A2;"products\/(\d+)")
This is for extracting the product id.
In cell C2, you can paste :
="https://customsearch.googleapis.com/customsearch/v1?cx={yoursearchengineID}&key={yourAPIkey}&num=1&fields=items(pagemap(offer(price)))&q="&B2
We construct the request for the API. You need to add your API key and your search engine id in this formula.
In cell D2, you can paste :
=QUERY(ImportJSON(C2);"SELECT Col1 label Col1''";1)
This for importing the .json result and cleaning it a bit.
Notes : this method could fail with some products (the new ones). I'm based in Europe. So ";" in the formulas should be replaced with ",".
Upvotes: 1