edA-qa mort-ora-y

Reputation: 31851

Python requests, how to limit received size, transfer rate, and/or total time?

My server makes external requests and I'd like to limit the damage a failing request can do. I'm looking to cancel a request when it exceeds a limit on received size, transfer rate, or total time.

Note I am not looking for the timeout parameter in requests, as this is a timeout only for inactivity. I'm unable to find anything to do with a total timeout, or a way to limit the total size. One example shows a maxsize parameter on HTTPAdapter but that is not documented.

How can I achieve these requirements using requests?

Upvotes: 22

Views: 39832

Answers (2)

iampritamraj

Reputation: 316

This works for me:

import requests

max_size = 10000  # total size limit in bytes; adjust as needed
response = requests.get(your_url, stream=True, timeout=10)
response_content = []  # collects the partial or full page source
received = 0

for chunk in response.iter_content(1024):
    response_content.append(chunk)
    received += len(chunk)
    if received > max_size:  # stop reading once the limit is exceeded
        response.close()
        break

Upvotes: -2

Martijn Pieters

Reputation: 1121366

You could try setting stream=True, then aborting a request when your time or size limits are exceeded while you read the data in chunks.

As of requests release 2.3.0 the timeout applies to streaming requests too, so all you need to do is allow for a timeout for the initial connection and each iteration step:

import time

import requests

r = requests.get(..., stream=True, timeout=initial_timeout)
r.raise_for_status()

# Content-Length may be absent (e.g. chunked encoding), so default to 0
# and rely on the running total below in that case.
if int(r.headers.get('Content-Length', 0)) > your_maximum:
    raise ValueError('response too large')

size = 0
start = time.time()

for chunk in r.iter_content(1024):
    # total time spent receiving the body so far
    if time.time() - start > receive_timeout:
        raise ValueError('timeout reached')

    size += len(chunk)
    if size > your_maximum:
        raise ValueError('response too large')

    # do something with chunk

Adjust the timeout as needed.
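
If you need this in more than one place, the loop above can be pulled into a small helper. The following is only a sketch: read_limited and ResponseLimitExceeded are hypothetical names, not part of requests, and the helper accepts any iterable of byte chunks (such as r.iter_content(1024)) so that it can be exercised without a network connection. The injectable clock argument is likewise an assumption, added to make the time limit easy to test.

```python
import time


class ResponseLimitExceeded(Exception):
    """Hypothetical exception for this sketch; not part of requests."""


def read_limited(chunks, max_bytes, max_seconds, clock=time.monotonic):
    """Collect chunks, aborting on a total size or total time limit.

    `chunks` is any iterable of bytes objects, e.g. r.iter_content(1024).
    """
    deadline = clock() + max_seconds
    received = bytearray()
    for chunk in chunks:
        # The time check only runs when a chunk arrives; a stalled read
        # still has to be caught by the per-read timeout of the request.
        if clock() > deadline:
            raise ResponseLimitExceeded('total time limit reached')
        received.extend(chunk)
        if len(received) > max_bytes:
            raise ResponseLimitExceeded('response too large')
    return bytes(received)


# Possible usage with a streaming response r:
#     body = read_limited(r.iter_content(1024), your_maximum, receive_timeout)
```

On either exception the caller should still call r.close() so the connection is released rather than left half-read.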

For requests releases < 2.3.0 (which included this change) you could not time out the r.iter_content() yield; a server that stops responding in the middle of a chunk would still tie up the connection. You'd have to wrap the above code in an additional timeout function to cut off long-running responses early.
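
One possible shape for such a wrapper, assuming a worker thread is acceptable, is sketched below. run_with_deadline is a hypothetical helper name; it uses the standard library's concurrent.futures to stop waiting after a fixed number of seconds. It cannot kill the worker thread, so a stalled download keeps running in the background until its socket is closed or the read returns.

```python
import concurrent.futures


def run_with_deadline(func, seconds, *args, **kwargs):
    """Run func(*args, **kwargs) in a worker thread, waiting at most `seconds`.

    Raises concurrent.futures.TimeoutError if the call has not finished in time.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args, **kwargs)
    try:
        return future.result(timeout=seconds)
    finally:
        # Do not block on a worker that may still be stuck in a read.
        executor.shutdown(wait=False)


# Possible usage, where download is your own function containing the
# streaming loop:
#     body = run_with_deadline(download, 30, url)
```

Because the thread is only abandoned, not stopped, this limits how long the caller waits rather than how long the connection stays open.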

Upvotes: 28
