Dany M

Reputation: 133

How To Read Big Files in NodeJS?

I'm trying to read a file with 20 million lines and convert its line endings from Windows to Mac. I know this can be done with sed, but it gives me an error that I don't know how to fix (dos2unix: Binary symbol 0x0008 found at line 625060). So I'm trying to do it in Node.js instead. Here's my code:

var fs = require('fs');
var eol = require('eol');

// read the whole file into memory (this is the step that fails for very large files)
var input = fs.readFileSync(process.argv[2], 'utf8');

// normalize line endings
var output = eol.auto(input);
console.log("Lines Fixed! Now Writing....");

// write the result; fs.writeFile is asynchronous, so log completion inside the callback
fs.writeFile(process.argv[2] + '_fixed.txt', output, function (err) {
  if (err) return console.log(err);
  console.log("Done!");
});

The problem is that the file is too big, and I get this error:

buffer.js:513 throw new Error('"toString()" failed');

Upvotes: 5

Views: 19809

Answers (2)

Lazyexpert

Reputation: 3154

You shouldn't read it synchronously. The best way to deal with big data is streams:

const fs = require('fs');
const eol = require('eol');

const filename = process.argv[2];
let output = '';

// pass an encoding so multi-byte characters are not split across chunk boundaries
const readStream = fs.createReadStream(filename, 'utf8');

readStream.on('data', function (chunk) {
  output += eol.auto(chunk);
});

readStream.on('end', function () {
  console.log('finished reading');
  // write to file here.
});

Upvotes: 8

LF-DevJourney

Reputation: 28529

For very big files, you'd better not read the whole file into memory; read it line by line or in chunks instead. For how to do that in Node.js, see my answer to node.js: read a text file into an array. (Each line an item in the array.).

Upvotes: 0
