Reputation: 2994
I am using requests to download a large (~50 MiB) file on a small embedded device running Linux.
The file is to be written to an attached MMC.
Unfortunately the MMC write speed is lower than the network speed, and I see memory consumption rise; in a few cases I even got a kernel "unable to handle page..." error.
The device has only 128 MiB of RAM.
The code I'm using is:
with requests.get(URL, stream=True) as r:
    if r.status_code != 200:
        log.error(f'??? download returned {r.status_code}')
        return -(offset + r.status_code)
    siz = 0
    with open(sfn, 'wb') as fo:
        for chunk in r.iter_content(chunk_size=4096):
            fo.write(chunk)
            siz += len(chunk)
    return siz
How can I temporarily pause the server while I write to the MMC?
Upvotes: 1
Views: 1916
Reputation: 2598
You can try decreasing the size of the TCP receive buffer with this bash command:
echo 'net.core.rmem_max=1000000' >> /etc/sysctl.conf
(1 MB, you can tune this)
This prevents a huge buffer from building up at that stage of the process.
Then write code that only reads from the TCP stack and writes to the MMC at specified intervals, to prevent buffers from building up elsewhere in the system (such as the MMC write buffer) -- see for example @e3n's answer.
Hopefully this will cause packets to be dropped and then re-sent by the server once the buffer opens up again.
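A rough sketch of that kind of paced read loop (the chunk size, sleep interval, and the fsync call are assumptions to tune for your MMC, not part of the original answer):

import os
import time
import requests

def paced_download(url, path, chunk_size=64 * 1024, pause=0.05):
    # Read from the socket only as fast as the MMC can absorb the data.
    with requests.get(url, stream=True, timeout=30) as r:
        r.raise_for_status()
        with open(path, 'wb') as fo:
            for chunk in r.iter_content(chunk_size=chunk_size):
                fo.write(chunk)
                fo.flush()
                os.fsync(fo.fileno())  # push the chunk out to the MMC before reading more
                time.sleep(pause)      # let the (now smaller) TCP receive buffer fill,
                                       # so flow control slows the sender down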
Upvotes: 1
Reputation: 8352
If the web server supports the HTTP Range header, you can request only part of the large file and then step through the entire file part by part.
Take a look at this question, where James Mills gives the following example code:
from requests import get
url = "http://download.thinkbroadband.com/5MB.zip"
headers = {"Range": "bytes=0-100"} # first 100 bytes
r = get(url, headers=headers)
As your problem is memory, you will want to stop the server from sending you the whole file at once, as this will certainly be buffered by some code on your device. Unless you can make requests drop part of the data it receives, this will always be a problem. Additional buffers downstream of requests
will not help.
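For the stepping itself, a minimal sketch along those lines could look like this (the part size and the HEAD request for Content-Length are assumptions; the server must actually honour Range):

import requests

def ranged_download(url, path, part_size=1024 * 1024):
    # Fetch the file in fixed-size ranges so only one part is in memory at a time.
    total = int(requests.head(url).headers['Content-Length'])
    with open(path, 'wb') as fo:
        for start in range(0, total, part_size):
            end = min(start + part_size, total) - 1
            r = requests.get(url, headers={"Range": f"bytes={start}-{end}"})
            r.raise_for_status()  # expect 206 Partial Content
            fo.write(r.content)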
Upvotes: 1
Reputation: 542
if r.status_code != 200:
    log.error(f'??? download returned {r.status_code}')
    return -(offset + r.status_code)
siz = 0
with open(sfn, 'wb') as fo:
    for chunk in r.iter_content(chunk_size=4096):
        fo.write(chunk)
        siz += len(chunk)
return siz
You can rewrite it as a coroutine
import requests

def producer(URL, temp_data, n):
    with requests.get(URL, stream=True) as r:
        if r.status_code != 200:
            # log and offset come from the enclosing scope, as in the question
            log.error(f'??? download returned {r.status_code}')
            return -(offset + r.status_code)
        for chunk in r.iter_content(chunk_size=n):
            temp_data.append(chunk)
            yield  # waiting for the consumer to finish

def consumer(temp_data, fname):
    with open(fname, 'wb') as fo:
        while True:
            while len(temp_data) > 0:
                # pop instead of removing items while iterating over the list
                fo.write(temp_data.pop(0))
            # You can add sleep here
            yield  # waiting for more data

def coordinator(URL, fname, n=4096):
    temp_data = list()
    c = consumer(temp_data, fname)
    p = producer(URL, temp_data, n)
    while True:
        try:
            # getting data
            next(p)
        except StopIteration:
            break
        finally:
            # writing data
            next(c)
These are all the functions you need. To call them:
URL = "URL"
fname = 'filename'
coordinator(URL,fname)
Upvotes: 1