How to download multi-part file from multiple servers

Question

I am working on a project that requires downloading a file that is stored in parts across multiple servers.

Requirements:

The solution must use JavaScript on the client side.
It should support large files, ~50 GB and more.
It should be fast and not crash the browser from memory overload.

Before inventing my own "bicycle" I just want to check whether there are any existing solutions. I didn't find one good enough searching Google and GitHub.

If there are no solutions like that, I'd appreciate some advice on the limits of the new File API. Is it even able to handle files that large?

josh3736 · Accepted Answer

Quite frankly, I seriously doubt you'll be able to pull this off.

For files of the size you're working with, you'll be much better off simply asking your end-users to install a BitTorrent client and distribute your downloads that way.

That said, a few roadblocks to consider:

There are two file-related APIs. One is the File object, but that's only for reading files selected through an <input type="file"> element or dropped via drag-n-drop.

The one you want is the FileSystem API, but there's one very important caveat: this API gives you a virtual filesystem whose contents are obscured from the user. In practical terms, this means that the files you write to disk will be stored in an obscure location unknown to the user (something like \Users\Me\AppData\Local\Chrome\User Data\Default\File System\000\), and the user must click a specially-constructed link that initiates the browser's normal file download mechanism (which in this case means copying the file from the "virtual" filesystem to the user's Downloads folder).
As a consequence of being sandboxed to a virtual filesystem and having to copy the file to its destination, the user must have 2 * n bytes free to download an n-byte file. So I'd need 100 GB to download your 50 GB file.
Your virtual filesystem must request quota, and the user must approve the request before you can start writing. While the good news is that...
```
// 53687091200 bytes = 50 * 1024^3 (50 GiB)
webkitStorageInfo.requestQuota(webkitStorageInfo.PERSISTENT, 53687091200);
```
...appeared to succeed for me, there's no guarantee that browsers will always allow requests for such large amounts of storage space.
You can write Blobs to your virtual filesystem with the FileEntry object. The documentation is incomplete, but I'd hope that you can write to arbitrary positions in the file.
XHR apparently does not allow you to stream response data. When you ask XHR to give you a response as a Blob (a new feature), it must buffer the entire response in memory.

There are hacks that allow you to poll an XHR object for response data as it comes in, but the browser will necessarily buffer the entire response, even though you've already read the earlier bytes.

This means your individual file parts can be no larger than a couple of megabytes. With an average HTTP request/response header overhead of 800 bytes to 1 kB, you're looking at an extra 50 MB just in HTTP headers over the course of 50 GB. (I know 0.1% is a tiny amount of overhead; it's just something to consider.)
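To make the part-size arithmetic concrete, here is a rough sketch of how the download could be split into small byte ranges and how the header overhead adds up. The function names (planParts, estimateHeaderOverhead) and the server URLs are hypothetical, and the sketch assumes every server can answer an HTTP Range request for any slice of the file. If each server only holds specific parts, the mapping from part to server would have to follow that layout instead.

```javascript
// Hypothetical helper: split totalBytes into parts of at most partBytes and
// assign them round-robin to the given servers (assumption: any server can
// serve any byte range of the file).
function planParts(totalBytes, partBytes, servers) {
  if (partBytes <= 0 || servers.length === 0) {
    throw new Error("partBytes must be positive and servers non-empty");
  }
  const parts = [];
  for (let start = 0, i = 0; start < totalBytes; start += partBytes, i++) {
    // HTTP Range end offsets are inclusive, hence the -1.
    const end = Math.min(start + partBytes, totalBytes) - 1;
    parts.push({
      url: servers[i % servers.length],
      start: start,
      end: end,
      rangeHeader: "bytes=" + start + "-" + end,
    });
  }
  return parts;
}

// Hypothetical helper: total header bytes if every part costs one request
// with headerBytesPerRequest bytes of request/response headers.
function estimateHeaderOverhead(totalBytes, partBytes, headerBytesPerRequest) {
  const requests = Math.ceil(totalBytes / partBytes);
  return requests * headerBytesPerRequest;
}
```

With 1 MiB parts and 1 KiB of headers per request, estimateHeaderOverhead(50 * 1024 ** 3, 1024 ** 2, 1024) comes out to 52428800 bytes, which is the roughly 50 MB figure above. Each planned part's rangeHeader value would be sent as the Range request header, so only that small slice is buffered in memory at a time.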

Again, don't do it. Use the right tool for the job, which in this case is BitTorrent. I'd imagine that somewhere out there is a standalone BT client that you can configure to automatically start downloading a preconfigured torrent. So a user would just click the download link, start the EXE, and be on their way.
