cecemel

Reputation: 636

FileReader: reading many files with JavaScript without memory leaks

In a web page, I have to read a small part of each of many (1,500 to 12,000) small files, each approximately 1 MB big. Once I have collected the information I need, I push it back to the server.

My problem: I use the FileReader API, garbage collection does not seem to run, and memory consumption explodes.

Code goes as:

function extract_information_from_files(input_files) {
    // some dummy implementation
    for (var i = 0; i < input_files.length; ++i) {
        (function dummy_function(file) {
            var reader = new FileReader();

            reader.onload = function () {
                // convert to Uint8Array because the library used expects this
                var array_buffer = new Uint8Array(reader.result);

                // do some fancy stuff with the library (only a very small subset of the data is kept)

                // finish

                // the function call ends, so I expect garbage collection to start cleaning;
                // even explicit dereferencing does not work
            };

            reader.readAsArrayBuffer(file);
        })(input_files[i]);
    }
}
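For completeness, the explicit dereferencing mentioned in the comments would look roughly like the sketch below; the exact statements are an assumption, since the original attempt is not shown:

reader.onload = function () {
    var array_buffer = new Uint8Array(reader.result);

    // ... extract the small subset of data ...

    // attempted explicit dereferencing (did not help):
    array_buffer = null;
    reader = null;
};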

Some remarks:

One last strange detail (posted for completeness): when I use FileReader combined with https://gildas-lormeau.github.io/zip.js/, reading each File just before pushing it into a zip archive, garbage collection just works.
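For context, that combination might have looked something like the sketch below, using zip.js's legacy callback API; the exact calls are assumptions, since the original zip.js code is not shown:

var reader = new FileReader();
reader.onload = function () {
    // inspect reader.result, then push the original File into the archive;
    // zip_writer is assumed to come from zip.createWriter(new zip.BlobWriter("application/zip"), ...)
    zip_writer.add(file.name, new zip.BlobReader(file), function () {
        // entry added; continue with the next file
    });
};
reader.readAsArrayBuffer(file);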

All these remarks seem to point towards me not using FileReader as I should, so please tell me how.

Upvotes: 6

Views: 4002

Answers (1)

m4ktub

Reputation: 3121

The problem may be related to the order of execution. In your for loop you start reading all files with reader.readAsArrayBuffer(file). This code runs before any onload handler is run for a reader. Depending on the browser's implementation of FileReader, this can mean the browser loads the entire file (or at least preallocates a buffer for the entire file) before any onload is called.
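A minimal sketch showing that ordering; every "read started" line is logged before any "onload" line appears:

for (var i = 0; i < input_files.length; ++i) {
    var reader = new FileReader();
    reader.onload = function () {
        console.log("onload");                    // fires later, from the event loop
    };
    reader.readAsArrayBuffer(input_files[i]);     // starts the read immediately
    console.log("read started");
}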

Try processing the files as a queue and see if it makes a difference. Something like:

function extract_information_from_files(input_files) {
    var reader = new FileReader();

    function process_one() {
        // pop() takes files from the end: the array is processed in reverse and emptied
        var single_file = input_files.pop();
        if (single_file === undefined) {
            return;
        }

        (function dummy_function(file) {
            //var reader = new FileReader();

            reader.onload = function () {
                // do your stuff
                // process next at the end
                process_one();
            };

            reader.readAsArrayBuffer(file);
        })(single_file);
    }

    process_one();
}

extract_information_from_files(file_array_1);
// uncomment next line to process another file array in parallel
// extract_information_from_files(file_array_2);

EDIT: It seems that browsers expect you to reuse FileReaders. I've edited the code to reuse a single reader and tested (in Chrome) that memory usage stays limited to the size of the largest file you read.
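Since only a small part of each file is needed, another way to bound memory, independent of reusing the reader, is to read just a slice of each File with Blob.slice; the 1024-byte length below is an assumption about how much of each file is actually needed:

// inside process_one, replace the readAsArrayBuffer call with:
reader.readAsArrayBuffer(file.slice(0, 1024)); // reads only the first 1024 bytes

Blob.slice returns a new Blob covering just that byte range, so reader.result holds at most 1024 bytes instead of the whole file.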

Upvotes: 3
