Elliott B

Reputation: 1199

How to handle rate limiting with requests_futures?

I have been using Python requests to get data from an API, and I want to speed it up by running the requests asynchronously with requests_futures. I am only allowed 200 API requests per minute, so I have to check for a 429 response and wait before retrying. The number of seconds to wait is returned in the Retry-After header. Here is the original, working synchronous code:

  session = requests.Session()
  for id in ticketIds:
    url = 'https://colorfront.zendesk.com/api/v2/tickets/' + str(id) + '/comments.json'
    # use the session so the connection is reused between requests
    req = session.get(url, auth=zd_secret)

    # 429 means rate limited: wait as long as the server asks, then retry once
    if req.status_code == 429:
      time.sleep(int(req.headers['Retry-After']))
      req = session.get(url, auth=zd_secret)

    comments += req.json()['comments']

The following asynchronous code works until it hits the rate limit; after that, every remaining request fails.

  session = FuturesSession()
  futures = {}
  for id in ticketIds:
    url = 'https://colorfront.zendesk.com/api/v2/tickets/' + str(id) + '/comments.json'
    futures[id] = session.get(url, auth=zd_secret)

  for id in ticketIds:
    comments += futures[id].result().json()['comments']

When I hit the rate limit, I need a way to retry only the requests which failed. Does requests_futures have some built-in way to handle this?

Update: The requests_futures library does not have anything built-in for this. I found this related open issue: https://github.com/ross/requests-futures/issues/26. I'll try to pace the requests up front since I know the API limit, but that won't help if another user from my organization is simultaneously hitting the same API.
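
For reference, pacing the requests up front could look roughly like the sketch below. It is not part of requests_futures: paced is a hypothetical helper, and spreading the calls evenly across the minute is only one assumption about how to stay under a 200-per-minute limit. The sleep and clock parameters exist only so the timing can be checked without actually waiting.

```python
import time


def paced(items, per_minute, sleep=time.sleep, clock=time.monotonic):
    """Yield items no faster than `per_minute` per 60 seconds, evenly spaced.

    Hypothetical helper for illustration; not part of requests_futures.
    """
    interval = 60.0 / per_minute
    next_slot = clock()
    for item in items:
        wait = next_slot - clock()
        if wait > 0:
            sleep(wait)
        # schedule the next item one interval after this one is released
        next_slot = max(next_slot, clock()) + interval
        yield item


# Intended use with the code above (not executed here):
# for id in paced(ticketIds, per_minute=200):
#     futures[id] = session.get(url, auth=zd_secret)
```

This only limits how fast this one script submits requests, so a 429 can still come back if someone else is using the same quota.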

Upvotes: 3

Views: 3309

Answers (2)

Elliott B

Reputation: 1199

I think I have found a solution. I don't know if it's the best way, but it avoids another dependency. I can tune max_workers and x (the number of requests sent per batch) to optimize efficiency depending on the internet speed at this coffee shop.

import time
from requests_futures.sessions import FuturesSession

session = FuturesSession(max_workers=2)
delay = 0
x = 200  # number of requests to send per batch
while ticketIds:
  time.sleep(delay)
  delay = 0  # reset so a stale Retry-After value is not reused on the next pass

  # queue up to x requests; the slice also covers the case of fewer than x IDs left
  futures = {}
  for id in ticketIds[:x]:
    url = 'https://colorfront.zendesk.com/api/v2/tickets/' + str(id) + '/comments.json'
    futures[id] = session.get(url, auth=zd_secret)

  for id, future in futures.items():
    res = future.result()
    if res.status_code == 200:
      # remove successful IDs so only the failed ones are retried
      ticketIds.remove(id)
      comments += res.json()['comments']
    else:
      # wait as long as the server asks; the 1 second fallback for a missing
      # header is an assumption. An ID that never returns 200 (e.g. a 404) loops forever.
      delay = max(delay, int(res.headers.get('Retry-After', 1)))
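
One detail worth guarding against: HTTP allows Retry-After to be either a number of seconds or an HTTP date, and the header may be absent on errors other than 429. The API in the question reportedly sends seconds, so the sketch below is defensive rather than required. retry_after_seconds is a hypothetical helper name, and the 1 second default is an assumption.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(headers, default=1.0, now=None):
    """Return how many seconds to wait according to a Retry-After header.

    Hypothetical helper: accepts delay-seconds or an HTTP date and falls
    back to `default` when the header is missing or unparseable.
    """
    value = headers.get('Retry-After')
    if value is None:
        return default
    try:
        # most common form: an integer number of seconds
        return max(0.0, float(int(value)))
    except ValueError:
        pass
    try:
        # alternative form: an HTTP date such as 'Wed, 21 Oct 2015 07:28:00 GMT'
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

In the loop above, this could replace the int(res.headers[...]) lookup, for example delay = max(delay, retry_after_seconds(res.headers)).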

Upvotes: 1

moebius

Reputation: 2259

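You should be able to use the Retry class from urllib3.util.retry to achieve this. With 429 in status_forcelist and respect_retry_after_header=True, the adapter waits for the interval given in the Retry-After header and then retries the request automatically, up to the configured number of attempts: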

from requests_futures.sessions import FuturesSession
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = FuturesSession()
retries = 5
status_forcelist = [429]
retry = Retry(
     total=retries,
     read=retries,
     connect=retries,
     respect_retry_after_header=True,
     status_forcelist=status_forcelist,
)

adapter = HTTPAdapter(max_retries=retry)
session.mount('http://', adapter)
session.mount('https://', adapter)

futures = {}
for id in ticketIds:
    url = 'https://colorfront.zendesk.com/api/v2/tickets/' + str(id) + '/comments.json'
    futures[id] = session.get(url, auth=zd_secret)

for id in ticketIds:
    comments += futures[id].result().json()['comments']
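
To see which responses a configuration like this would actually retry, the Retry object can be queried offline with its is_retry method. This is a small sketch based on my understanding of urllib3's defaults, not something taken from the requests_futures docs. The variable names are only for illustration.

```python
from urllib3.util.retry import Retry

retry = Retry(total=5, status_forcelist=[429], respect_retry_after_header=True)

# GET with a 429 is retried because 429 is in status_forcelist
get_429 = retry.is_retry('GET', 429)

# a 500 is not in status_forcelist, so it is returned to the caller as-is
get_500 = retry.is_retry('GET', 500)

# POST is not among the methods urllib3 retries by default
post_429 = retry.is_retry('POST', 429)

print(get_429, get_500, post_429)
```

If POST requests also need retrying, the set of retryable methods has to be passed explicitly. The parameter name depends on the urllib3 version (allowed_methods in newer releases, method_whitelist in older ones).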

Upvotes: 3
