Courage

Reputation: 543

Asynchronously writing a file millions of times causes out of memory

Below is the code:

var fs = require('fs')

for(let i=0;i<6551200;i++){
    fs.appendFile('file',i,function(err){

    })
}

When I run this code, after a few seconds it shows:

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory

and yet there is nothing in the file!

My questions are:

  1. Why is there nothing in the file?
  2. What causes the out-of-memory error?
  3. How can I write to a file asynchronously in a for loop, no matter how many writes there are?

Thanks in advance.

Upvotes: 0

Views: 1766

Answers (3)

Neil Lunn

Reputation: 151112

Bottom line here is that fs.appendFile() is an asynchronous call and you simply are not "awaiting" that call to complete on each loop iteration. This has a number of consequences, including but not limited to:

  • The callbacks keep getting allocated before they are resolved, which is what eventually exhausts the heap and produces the "out of memory" error.

  • You are contending for a file handle, since the function you are employing actually opens, writes, and closes the given file; if you don't wait for each turn to do so, then those operations will simply clash.

So the simple solution here is to "wait", and some modern syntax sugar makes that easy:

const fs = require('mz/fs');

const x = 6551200;

(async function() {
  try {
    const fd = await fs.open('file','w');
    for (let i = 0; i < x; i++) {
      await fs.write(fd, `${i}\n`);
    }
    await fs.close(fd);
  } catch(e) {
    console.error(e)
  } finally {
    process.exit();
  }
})()

That will of course take a while, but it's not going to "blow up" your system whilst it does its work.

The first simplification is to grab the mz library, which wraps common nodejs modules with modernized versions of each function that support promises. This helps clean up the syntax a lot compared to using callbacks.

The next thing to realize is what was mentioned about fs.appendFile(): it is "opening/writing/closing" all in one call. That's not great, so what you would typically do is open the file once, write the bytes in a loop, and close the file handle when that is complete.

That "sugar" comes in modern versions, and though it is "possible" with plain promise chaining, it's still not really that manageable. So if you don't have a nodejs environment that supports the async/await sugar, or the tools to "transpile" such code, then you might alternatively consider using the async library with plain callbacks:

const Async = require('async');
const fs = require('fs');

const x = 6551200;

let i = 0;
fs.open('file','w',(err,fd) => {
  if (err) throw err;

  Async.whilst(
    () => i < x,
    callback => fs.write(fd,`${i}\n`,err => {
      i++;
      callback(err)
    }),
    err => {
      if (err) throw err;
      fs.closeSync(fd);
      process.exit();
    }
  );

});

The same base principle applies, as we are "waiting" for each callback to complete before continuing. The whilst() helper here allows iteration until the test condition is met, and of course does not start the next iteration until the callback of the iteratee itself has been invoked.

There are other ways to approach this, but those are probably the two most sane for a "large loop" of iterations. Common approaches such as "chaining" via .reduce() are really more suited to a "reasonably" sized array of data you already have, and building an array of such size here has inherent problems of its own.

For instance, the following "works" ( on my machine at least ) but it really consumes a lot of resources to do it:

const fs = require('mz/fs');
const x = 6551200;

fs.open('file','w')
  .then( fd =>
    [ ...Array(x)].reduce(
      (p,e,i) => p.then( () => fs.write(fd,`${i}\n`) )
      , Promise.resolve()
    )
    .then(() => fs.close(fd))
  )
  .catch(e => console.error(e) )
  .then(() => process.exit());

So it's really not that practical to build such a large chain in memory and then allow it to resolve. You could put some "governance" on this, as sketched below, but the two main approaches as shown are a lot more straightforward.
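
A rough sketch of what such "governance" could look like, assuming the same mz wrappers as above and a hypothetical CHUNK batch size: the .reduce() chain is built in fixed-size batches, so only one batch of promises exists at a time instead of 6551200 links up front.

const fs = require('mz/fs');

const x = 6551200;
const CHUNK = 10000;             // hypothetical batch size

// Build the promise chain for one batch of writes only
const writeChunk = (fd, start) =>
  [...Array(Math.min(CHUNK, x - start))].reduce(
    (p, e, i) => p.then(() => fs.write(fd, `${start + i}\n`)),
    Promise.resolve()
  );

fs.open('file', 'w')
  .then(fd => {
    // Process batches one after another, never holding the full chain in memory
    const loop = start =>
      start >= x
        ? fs.close(fd)
        : writeChunk(fd, start).then(() => loop(start + CHUNK));
    return loop(0);
  })
  .catch(e => console.error(e))
  .then(() => process.exit());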

For that case, you either have the async/await sugar available, as it is within current LTS versions of Node ( LTS 8.x ), or, where you are restricted to a version without that support, I would stick with the other tried and true "async helpers" for callbacks.


You can of course "promisify" any function with the last few releases of nodejs right "out of the box", as it were, since Promise has been a global for some time:

const fs = require('fs');

await new Promise((resolve, reject) => fs.open('file','w',(err,fd) => {
  if (err) return reject(err);
  resolve(fd);
}));

So there really is no need to import libraries just to do that, but the mz library given as an example here does all of that for you. So it's really up to personal preference whether to bring in additional dependencies.
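
For reference, the same wrapping is also built into core since Node 8 via util.promisify, so a promise-returning version of any callback-style function can be had with no dependency at all. A minimal sketch of the same open/write/close flow:

const fs = require('fs');
const { promisify } = require('util');

const open = promisify(fs.open);
const write = promisify(fs.write);
const close = promisify(fs.close);

(async function() {
  // same open/write/close flow as above, without any extra dependency
  const fd = await open('file', 'w');
  await write(fd, 'some data\n');
  await close(fd);
})().catch(e => console.error(e));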

Upvotes: 1

desoares

Reputation: 861

1 - The file is empty because none of the fs.appendFile calls ever finished; the Node.js process crashed before they could.

2 - Node.js heap memory is limited, and every pending call keeps its callback (not only the "i" variable) in memory until it completes.

3 - You could try to use promises to do that.

"use strict";

const Bluebird = require('bluebird');
const fs = Bluebird.promisifyAll(require('fs'));

let promises = [];
for (let i = 0; i < 6551200; i++){
    promises.push(fs.appendFileAsync('file', i + '\n'));
}

Bluebird.all(promises)
.then(data => {
  console.log(data, 'End.');
})
.catch(e => console.error(e));

But this kind of logic cannot avoid the heap memory error for a loop this big, because all the promises are still created up front. You could increase the Node.js heap size or, the more reasonable way, spread the writes out on an interval:

'use strict';

const fs = require('fs');

let total = 6551200;

let interval = setInterval(() => {
  fs.appendFile('file', total + '\n', () => {});
  total--;
  if (total < 1) {
    clearInterval(interval);
  }
}, 1);

Upvotes: 0

Droid

Reputation: 90

JavaScript is a single-threaded language, which means your code can execute only one function at a time. So when you call an async function, its callback is "queued" to be executed later.

So in your code, you are queuing 6551200 calls at once, which of course crashes your app before appendFile finishes working on any of them.

You can achieve what you want by splitting your loop into smaller loops, using async/await, or using iterators.

If what you are trying to achieve is as simple as your code, you can use the following:

const fs = require("fs");

function SomeTask(i=0){
    fs.appendFile('file', String(i), function(err){
        //err in the write function
        if(err) console.log("Error", err);
        //check if you want to continue (loop)
        if(i < 6551200) return SomeTask(i + 1);
        //on finish
        console.log("done");
    });
}
SomeTask();

In the above code, you write a single value, and when that write is done, you call the next one. This function is just for basic usage; it needs a refactor and the use of JavaScript iterators for more advanced usage. Check out Iterators and generators on the MDN web docs.
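
As a rough, hypothetical sketch of that iterator-based direction (the generator values() and the driver writeAll() below are illustrative names, not standard APIs): a generator produces the values and a small driver appends one at a time, only pulling the next value once the previous write has completed.

const fs = require('fs');

// generator producing the values to write, one at a time
function* values(n) {
  for (let i = 0; i < n; i++) yield `${i}\n`;
}

// append the next value only after the previous append has finished
function writeAll(iterator, done) {
  const { value, done: finished } = iterator.next();
  if (finished) return done();
  fs.appendFile('file', value, err => {
    if (err) return done(err);
    writeAll(iterator, done);
  });
}

writeAll(values(6551200), err => {
  if (err) return console.error(err);
  console.log('done');
});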

Upvotes: 0
