RustyShackleford

Reputation: 3677

How to stagger asynchronous API calls to prevent 'Max retries exceeded' with the grequests library?

I have a list of ~250K URLs that I need to retrieve from an API.

I have made a class using grequests that works exactly how I want it to, except I think it is running too fast: after working through the entire list of URLs I get this error:

Problem: url: HTTPSConnectionPool(host='url', port=123): Max retries exceeded with url: url (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x38f466c18>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))

Code so far:

import grequests

lst = ['url', 'url2', 'url3']

class Test:
    def __init__(self):
        self.urls = lst

    def exception(self, request, exception):
        print("Problem: {}: {}".format(request.url, exception))

    def async(self):
        # note: `async` still works as a method name on Python 3.6, but it is a
        # reserved keyword from 3.7 onwards, so it is worth renaming eventually
        return grequests.map((grequests.get(u) for u in self.urls),
                             exception_handler=self.exception, size=5)


    def collate_responses(self, results):
        # failed requests come back as None, so they would need filtering here
        return [x.text for x in results]

test = Test()
# here we collect the responses returned by the async method
results = test.async()

How can I slow the code down a bit to prevent the 'Max retries' error? Or, even better, how can I chunk the list I have and pass the URLs in batches?

Using python3.6 on mac.

Edit:

This question is not a duplicate; I have to pass many URLs to the same endpoint.

Upvotes: 0

Views: 1326

Answers (1)

Ryan Z

Reputation: 70

Try replacing the grequests.map call with a loop and adding a sleep between requests:

from time import sleep

import grequests

for u in self.urls:
    req = grequests.get(u)
    job = grequests.send(req)  # dispatch the request without waiting for the response
    sleep(5)                   # pause so the host isn't flooded with connections

A similar issue was resolved with sleep.
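If you would rather keep grequests.map, you could also split the URL list into chunks and pause between batches. A rough sketch of that idea (the chunk size of 100 and the 2-second pause are illustrative guesses, not tuned values):

from time import sleep

import grequests

def chunks(lst, n):
    # yield successive n-sized slices of lst
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

results = []
for batch in chunks(lst, 100):  # 100 URLs per batch is a guess; tune as needed
    rs = (grequests.get(u) for u in batch)
    results.extend(grequests.map(rs, size=5))
    sleep(2)  # breathing room between batches so connections aren't exhausted

Since grequests.map blocks until the whole batch finishes, the sleep only runs between batches, which keeps throughput much higher than sleeping after every single request.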

Upvotes: 1
