Reputation: 4640
Because finding a good scraping solution for Python is such a hassle, I'm using Dryscrape. However, I can't seem to get it to work consistently through a proxy. Some sites cause it to throw the following:
InvalidResponseError: Error while loading URL https://apis.google.com/js/plusone.js: Operation on socket is not supported (error code 99)
I guess it's some kind of proxy protection, but I'm not breaking any TOS. Only some sites do this, but the whole project relies on looking something up on the site daily. Does anyone have a solution?
Upvotes: 1
Views: 190
Reputation: 120
It's really hard to tell without seeing any code or knowing what you are trying to accomplish. But if you are trying to scrape a lot of pages at once, try throttling back the number of concurrent connections to your proxy. Does it occur on the same page(s) on each attempt?
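As a rough sketch of what throttling could look like, the snippet below caps how many page loads run at once and pauses briefly after each one. It is not Dryscrape-specific: `fetch` is a hypothetical callable standing in for whatever loads one page through your proxy, and `fetch_all`, the limit of 2, and the half-second delay are all assumptions to tune.

```python
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT = 2   # assumed limit; lower it if the proxy still errors out
DELAY_SECONDS = 0.5  # assumed pause after each request


def fetch_all(urls, fetch, max_concurrent=MAX_CONCURRENT, delay=DELAY_SECONDS):
    """Call fetch(url) for every URL, with at most max_concurrent calls in flight.

    Returns a dict mapping each URL to its result, or to the exception it raised,
    so you can see whether the same pages fail on every attempt.
    """
    def worker(url):
        try:
            return url, fetch(url)
        except Exception as exc:
            # Record the failure instead of aborting the whole batch.
            return url, exc
        finally:
            # Hold this worker slot briefly so requests are spaced out.
            time.sleep(delay)

    # The pool size is what limits the number of simultaneous connections.
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return dict(pool.map(worker, urls))
```

If the error persists even with `max_concurrent=1`, the problem is probably not load on the proxy, and the URLs that map to exceptions in the returned dict would be the ones to look at.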
Upvotes: 1