ZioByte

Reputation: 2994

How to limit download speed in python3 requests?

I am using requests to download a large (~50MiB) file on a small embedded device running Linux.

The file is to be written to an attached MMC.

Unfortunately the MMC write speed is lower than the network speed, so I see memory consumption rise and, in a few cases, I even got a kernel "unable to handle page..." error.

The device has only 128 MiB of RAM.

The code I'm using is:

            with requests.get(URL,  stream=True) as r:
                if r.status_code != 200:
                    log.error(f'??? download returned {r.status_code}')
                    return -(offset + r.status_code)
                siz = 0
                with open(sfn, 'wb') as fo:
                    for chunk in r.iter_content(chunk_size=4096):
                        fo.write(chunk)
                        siz += len(chunk)
                return siz

How can I temporarily pause the server while I write to the MMC?

Upvotes: 1

Views: 1916

Answers (3)

Anonymous1847

Reputation: 2598

You can try decreasing the size of the TCP receive buffer with this bash command:

echo 'net.core.rmem_max=1000000' >> /etc/sysctl.conf

(1 MB, you can tune this)

This prevents a huge buffer build-up at this stage of the process.
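
A minimal sketch of doing the same thing per connection instead of system-wide, assuming requests is backed by urllib3 (the adapter class name and the 64 KiB value are just placeholders to tune):

import socket

import requests
from requests.adapters import HTTPAdapter
from urllib3.connection import HTTPConnection


class SmallRecvBufferAdapter(HTTPAdapter):
    """Adapter that asks the kernel for a small per-socket receive buffer."""

    def init_poolmanager(self, *args, **kwargs):
        # Keep urllib3's default options (e.g. TCP_NODELAY) and add SO_RCVBUF.
        kwargs["socket_options"] = HTTPConnection.default_socket_options + [
            (socket.SOL_SOCKET, socket.SO_RCVBUF, 64 * 1024),  # 64 KiB
        ]
        super().init_poolmanager(*args, **kwargs)


session = requests.Session()
session.mount("http://", SmallRecvBufferAdapter())
session.mount("https://", SmallRecvBufferAdapter())
# session.get(URL, stream=True) now uses the smaller receive buffer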

Then write code that only reads from the TCP stack and writes to the MMC at specified intervals, to prevent buffers from building up elsewhere in the system (such as the MMC write buffer) -- see for example @e3n's answer.

Hopefully this will cause packets to be dropped and then re-sent by the server once the buffer opens up again.
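
A minimal sketch of such paced reading and writing, assuming the MMC can sustain roughly 2 MiB/s (the function name, rate and chunk size are placeholders to tune):

import time

import requests


def throttled_download(url, sfn, max_rate=2 * 1024 * 1024, chunk_size=64 * 1024):
    written = 0
    start = time.monotonic()
    with requests.get(url, stream=True) as r, open(sfn, 'wb') as fo:
        for chunk in r.iter_content(chunk_size=chunk_size):
            fo.write(chunk)
            written += len(chunk)
            # If we are ahead of the target rate, sleep; while we sleep the
            # (now smaller) kernel receive buffer fills up and TCP flow
            # control makes the server back off.
            ahead = written / max_rate - (time.monotonic() - start)
            if ahead > 0:
                time.sleep(ahead)
    return written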

Upvotes: 1

fuenfundachtzig

Reputation: 8352

If the web server supports the HTTP Range header, you can request a download of only part of the large file and then step through the entire file part by part.

Take a look at this question, where James Mills gives the following example code:

from requests import get

url = "http://download.thinkbroadband.com/5MB.zip"
headers = {"Range": "bytes=0-100"}  # first 100 bytes

r = get(url, headers=headers)

As your problem is memory, you will want to stop the server from sending you the whole file at once, as this will certainly be buffered by some code on your device. Unless you can make requests drop part of the data it receives, this will always be a problem. Additional buffers downstream of requests will not help.
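
A minimal sketch of stepping through the file this way, assuming the server answers with 206 Partial Content (the function name and the 1 MiB part size are placeholders):

import requests


def download_in_parts(url, sfn, part_size=1024 * 1024):
    written = 0
    with open(sfn, 'wb') as fo:
        while True:
            headers = {"Range": f"bytes={written}-{written + part_size - 1}"}
            r = requests.get(url, headers=headers)
            if r.status_code != 206:          # server ignored or rejected the Range
                break
            fo.write(r.content)               # at most part_size bytes held in memory
            written += len(r.content)
            if len(r.content) < part_size:    # short part: end of file reached
                break
    return written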

Upvotes: 1

SaGaR

Reputation: 542

            with requests.get(URL,  stream=True) as r:
                if r.status_code != 200:
                    log.error(f'??? download returned {r.status_code}')
                    return -(offset + r.status_code)
                siz = 0
                with open(sfn, 'wb') as fo:
                    for chunk in r.iter_content(chunk_size=4096):
                        fo.write(chunk)
                        siz += len(chunk)
                return siz

You can rewrite it as a pair of generator-based coroutines:

import logging

import requests

log = logging.getLogger(__name__)


def producer(URL, temp_data, n):
    # Fetch the file chunk by chunk, pausing after every chunk so the
    # consumer can drain the buffer before more data is requested.
    with requests.get(URL, stream=True) as r:
        if r.status_code != 200:
            log.error(f'??? download returned {r.status_code}')
            return
        for chunk in r.iter_content(chunk_size=n):
            temp_data.append(chunk)
            yield  # wait for the consumer to finish writing


def consumer(temp_data, fname):
    # Write buffered chunks to the MMC, then wait for more data.
    with open(fname, 'wb') as fo:
        while True:
            while temp_data:
                fo.write(temp_data.pop(0))  # write and discard the oldest chunk
                # You can add a sleep here to slow things down further
            yield  # buffer is empty: wait for more data


def coordinator(URL, fname, n=4096):
    temp_data = list()
    c = consumer(temp_data, fname)
    p = producer(URL, temp_data, n)
    next(c)  # prime the consumer so it is parked at its first yield
    while True:
        try:
            next(p)  # get one chunk
        except StopIteration:
            break    # download finished (or the request failed)
        next(c)      # write out the buffered chunk before fetching the next one
    c.close()        # closing the generator also closes the output file

These are all the functions you need. To call them:

URL = "URL"
fname = 'filename'
coordinator(URL,fname)

Upvotes: 1
