monoshiro

Reputation: 78

Tornado large file download

I'm experimenting with Content-Disposition in Tornado. My code for reading the file and writing it to the response looks like this:

with open(file_name, 'rb') as f:
    while True:
        data = f.read(4096)
        if not data:
            break
        self.write(data)
self.finish()

I expected memory usage to stay constant, since the file is not read all at once. But the resource monitor shows:

In use    Available
12.7 GB   2.5 GB

Sometimes it even crashes my computer with a BSOD.
How do I download a large file (say, 12 GB in size)?

Upvotes: 1

Views: 1844

Answers (1)

lanhao945

Reputation: 460

Tornado 6.0 provides an API that can be used to download large files, as below:

import aiofiles

async def get(self):
    self.set_header('Content-Type', 'application/octet-stream')
    # aiofiles delegates to a thread pool; it is not truly asynchronous I/O
    async with aiofiles.open(r"F:\test.xyz", "rb") as f:
        while True:
            data = await f.read(1024)
            if not data:
                break
            self.write(data)
            # Calling flush() is important: it sends each chunk out promptly,
            # which keeps memory usage low. Awaiting it waits for the chunk
            # to be written before the next one is read.
            await self.flush()

Just using aiofiles without calling self.flush() may not solve the problem.

Look at the self.write() method:

def write(self, chunk: Union[str, bytes, dict]) -> None:
    """Writes the given chunk to the output buffer.

    To write the output to the network, use the `flush()` method below.

    If the given chunk is a dictionary, we write it as JSON and set
    the Content-Type of the response to be ``application/json``.
    (if you want to send JSON as a different ``Content-Type``, call
    ``set_header`` *after* calling ``write()``).

    Note that lists are not converted to JSON because of a potential
    cross-site security vulnerability.  All JSON output should be
    wrapped in a dictionary.  More details at
    http://haacked.com/archive/2009/06/25/json-hijacking.aspx/ and
    https://github.com/facebook/tornado/issues/1009
    """
    if self._finished:
        raise RuntimeError("Cannot write() after finish()")
    if not isinstance(chunk, (bytes, unicode_type, dict)):
        message = "write() only accepts bytes, unicode, and dict objects"
        if isinstance(chunk, list):
            message += (
                ". Lists not accepted for security reasons; see "
                + "http://www.tornadoweb.org/en/stable/web.html#tornado.web.RequestHandler.write"  # noqa: E501
            )
        raise TypeError(message)
    if isinstance(chunk, dict):
        chunk = escape.json_encode(chunk)
        self.set_header("Content-Type", "application/json; charset=UTF-8")
    chunk = utf8(chunk)
    self._write_buffer.append(chunk)

At the end of the code, it just appends the data you want to send to _write_buffer.

Unless flush() is called, the data is sent only once the get or post method has finished and the finish method is called.
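
To make the buffering effect concrete, here is a minimal toy model in plain Python. It is not Tornado's real implementation: ToyHandler, stream, and the counters are hypothetical names invented for illustration, and the real flush() is asynchronous. It only mimics the "write() appends, flush() drains" behaviour described above and records the largest amount of data held in memory at once.

```python
class ToyHandler:
    """Toy stand-in for a request handler (hypothetical, not Tornado code)."""

    def __init__(self):
        self._write_buffer = []   # chunks waiting to be sent
        self._buffered = 0        # bytes currently held in the buffer
        self.peak_buffered = 0    # largest value _buffered ever reached
        self.sent_bytes = 0       # bytes handed over to the pretend network

    def write(self, chunk: bytes) -> None:
        # Like the write() shown above: only append to the buffer.
        self._write_buffer.append(chunk)
        self._buffered += len(chunk)
        self.peak_buffered = max(self.peak_buffered, self._buffered)

    def flush(self) -> None:
        # Hand everything buffered so far to the pretend network.
        self.sent_bytes += self._buffered
        self._write_buffer.clear()
        self._buffered = 0


def stream(handler, total_chunks, chunk_size, flush_each_chunk):
    """Write total_chunks chunks and return the peak number of buffered bytes."""
    for _ in range(total_chunks):
        handler.write(b"x" * chunk_size)
        if flush_each_chunk:
            handler.flush()
    handler.flush()  # plays the role of finish() at the end of the request
    return handler.peak_buffered


no_flush = stream(ToyHandler(), 1000, 4096, flush_each_chunk=False)
with_flush = stream(ToyHandler(), 1000, 4096, flush_each_chunk=True)
print(no_flush, with_flush)
```

In this model the peak without flushing grows with the total amount written, while flushing after each chunk keeps it at one chunk. That mirrors why the loop in the question ends up holding the whole file in memory.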

The documentation for the Tornado handler's flush method is:

http://www.tornadoweb.org/en/stable/web.html?highlight=flush#tornado.web.RequestHandler.flush

Upvotes: 2
