Reputation: 11
I'm trying to scrape information from Google, and they don't like it. The vector contains 2487 Google URLs, and from each one I want to get the text of the first result.
I tried to create a loop to slow down the process, but I'm very bad at it.
b is the vector that contains all the URLs. First, I tried:
ContentScraper(b, CssPatterns = ".st") -> b
But then, I tried to loop and slow it down, but I have no idea how to.
b[i] <- ContentScraper(i, CssPatterns = ".st")}
From the 55th URL onward, all I get is an error. Any thoughts on how to avoid it? Thanks.
Upvotes: 0
Views: 1631
Reputation: 1087
One way is to pause between requests with
Sys.sleep(...)
Another way, if you're using puppeteer or playwright, is to adjust the interval of the scrapes with celery beat.
Is that what you're looking for?
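To make the Sys.sleep(...) suggestion concrete, here is a minimal sketch of a throttled loop. The helper name scrape_slowly, its scrape_fun argument, and the 2 to 5 second delay range are assumptions for illustration, not part of Rcrawler; there is no guarantee that any particular delay keeps Google from blocking the requests.

```r
# Hypothetical helper: call scrape_fun on each URL in turn, pausing between requests.
# The default delays (in seconds) are guesses, not values known to satisfy Google.
scrape_slowly <- function(urls, scrape_fun, min_delay = 2, max_delay = 5) {
  results <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    # tryCatch keeps the loop running if one request fails; failures become NA.
    value <- tryCatch(
      scrape_fun(urls[i]),
      error = function(e) NA_character_
    )
    # Wrapping in list() keeps the slot even if scrape_fun returns NULL.
    results[i] <- list(value)
    # Random pause so the requests are not evenly spaced; skipped after the last URL.
    if (i < length(urls)) Sys.sleep(runif(1, min_delay, max_delay))
  }
  results
}

# Usage with the vector b from the question (not run here):
# library(Rcrawler)
# first_results <- scrape_slowly(b, function(u) ContentScraper(u, CssPatterns = ".st"))
```

Storing the output in a new object instead of overwriting b means the original URLs are still there if you need to retry the ones that came back as NA.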
Upvotes: 0