Please bear with me: I'm self-taught and not great at writing code.
I'm trying to scrape some data from a webpage into Google Sheets, but the page "lazy loads", so not all of the data is there immediately upon page load. The rest arrives a few seconds later.
My preference would be to use the IMPORTXML formula, but it only returns partial results for this reason. Next I tried writing a script, because I thought I could use Utilities.sleep to pause long enough for the rest of the page to load, but I'm getting the same results as I did with IMPORTXML.
Am I just putting Utilities.sleep in the wrong place in the code? I thought I would need it after UrlFetchApp.fetch(url) but before the match logic, but I suspect that doesn't work because the fetch has already completed by then. Is there a way to add a 5-10 second pause to let the URL load before doing the fetch? Or does anyone know of a way to do this within the IMPORTXML formula itself?
Thanks so much for your time and consideration!
https://www.expeditions.com/destinations/alaska
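function lowprice(url) {
  var html, content = '';
  // UrlFetchApp.fetch is synchronous: it returns only once the response has arrived.
  var response = UrlFetchApp.fetch(url);
  // Utilities.sleep(10 * 1000);
  // This one pulls all price-formatted numbers:
  // var regex = /(\$[0-9,]*)/g;
  // (?<=...) is a lookbehind: the match must come right after that text.
  // (?<!...) is a negative lookbehind: the text just before that point must NOT be that text.
  // (Requiring text to follow the match would need a lookahead, (?=...), instead.)
  var regex = /(?<="itinerary-card__price">)(\$[0-9,]*)(?<!<\/p>)/g;
  if (response) {
    html = response.getContentText();
    // match() returns an array of matches, or null if there are none
    if (html) content = html.match(regex);
  }
  return content;
}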
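To tell a regex problem apart from an incomplete page, the matching step can be tried on its own against a known string. The following is a minimal sketch, not part of the script above: extractPrices and hasPriceMarkup are hypothetical helper names, and sampleHtml is a made-up fragment rather than markup copied from the page. It reuses the same regex as lowprice.

```javascript
// Hypothetical helper: applies the same price regex as lowprice()
// to an HTML string, with no network access involved.
function extractPrices(html) {
  var regex = /(?<="itinerary-card__price">)(\$[0-9,]*)(?<!<\/p>)/g;
  var matches = html.match(regex);
  // match() returns null when nothing matches; normalize to an empty array
  return matches || [];
}

// Hypothetical helper: reports whether the price class name appears
// anywhere in the given HTML string.
function hasPriceMarkup(html) {
  return html.indexOf('itinerary-card__price') !== -1;
}

// Made-up sample markup, for illustration only (an assumption about
// what a price element might look like, not the real page source).
var sampleHtml =
  '<p class="itinerary-card__price">$5,420</p>' +
  '<p class="itinerary-card__price">$12,300</p>' +
  '<p class="other">$99</p>';

console.log(extractPrices(sampleHtml));
console.log(hasPriceMarkup(sampleHtml));
```

If the helpers behave as expected on the sample, running them on response.getContentText() shows how many price elements are actually present in the fetched HTML. Since UrlFetchApp.fetch has already returned the full response by the time the next line runs, a Utilities.sleep placed after it cannot add anything to that string.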