Lothre1

Reputation: 3853

NodeJS: What's the most efficient way to read the last X bytes of a very large file (+1GB)?

I would like to efficiently read the last X bytes of a very large file using node.js. What's the most efficient way of doing so?

As far as I know, the only way of doing this is to create a read stream and loop until I hit the byte index.

Example:

const fs = require('fs');

// Let's assume I want the last 10 bytes.
// I would open a stream and loop until I reach the end of the file.
// Once done, I would look at the last 10 bytes I kept in memory.

let f = fs.createReadStream('file.xpto'); // which is a 1 GB file
let data = [];

// The chunk parameter is named differently so it does not shadow `data`
f.on('data', function (chunk) {
    for (const d of chunk) {
        data.push(d);
        data = data.slice(-10); // keep only the last 10 elements
    }
});
f.on('end', function () {
    // check data
    console.log('Last 10 bytes are', data);
});
f.resume();

Upvotes: 5

Views: 5548

Answers (3)

Justin Dalrymple

Reputation: 888

For a promise-based version of the fs.read solution:

import FS from 'fs/promises';

// path: the file to read; bytesToRead: the x bytes you want to read
async function getLastXBytesBuffer(path, bytesToRead = 1024) {
  const handle = await FS.open(path, 'r');
  try {
    // FileHandle.stat() takes no path; it reports on the open file
    const { size } = await handle.stat();

    // Calculate the position x bytes from the end
    // (clamped so files smaller than x bytes are read in full)
    const length = Math.min(bytesToRead, size);
    const position = size - length;

    // Get the resulting buffer
    const { buffer } = await handle.read(Buffer.alloc(length), 0, length, position);
    return buffer;
  } finally {
    // Don't forget to close the file handle, even if reading fails
    await handle.close();
  }
}

Upvotes: 3

Lothre1

Reputation: 3853

Here's sample code based on Arash Motamedi's answer. It lets you read the last 10 bytes of a very large file in a few milliseconds.

const fs = require('fs');

const _path = 'my-very-large-file.xpto';
const stats = fs.statSync(_path);

const size = stats.size;
const sizeStart = size - 10; // offset of the first of the last 10 bytes
const sizeEnd = size - 1;    // `end` is inclusive, so stop at the last byte

const options = {
    start: sizeStart,
    end: sizeEnd
};
const stream = fs.createReadStream(_path, options);
stream.on('data', (data) => {
    console.log({ data });
});
stream.resume();

Upvotes: 6

Arash Motamedi

Reputation: 10682

You essentially want to seek to a certain position in the file. There's a way to do that. Please consult this question and the answers:

seek() equivalent in javascript/node.js?

Essentially, determine the starting position (using the file length from its metadata and the number of bytes you're interested in) and use one of the following approaches to read - as stream or via buffers - the portion you're interested in.


Using fs.read

fs.read(fd, buffer, offset, length, position, callback)

position is an argument specifying where to begin reading from in the file.
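
As a minimal sketch of the fs.read approach (not taken from the linked question), the snippet below opens the file, gets its size with fs.fstat, and reads only the tail. The helper name readLastBytes and its (err, buffer) callback shape are assumptions made for illustration.

```javascript
const fs = require('fs');

// Hypothetical helper: reads the last `count` bytes of the file at `filePath`
// and passes them to `callback(err, buffer)`.
function readLastBytes(filePath, count, callback) {
  fs.open(filePath, 'r', (openErr, fd) => {
    if (openErr) return callback(openErr);

    // Always close the descriptor before reporting the result.
    const finish = (err, result) => {
      fs.close(fd, (closeErr) => callback(err || closeErr, result));
    };

    fs.fstat(fd, (statErr, stats) => {
      if (statErr) return finish(statErr);

      // Clamp so that files shorter than `count` are read in full.
      const length = Math.min(count, stats.size);
      const position = stats.size - length;
      const buffer = Buffer.alloc(length);

      // offset 0: where in `buffer` to start writing;
      // position: where in the file to start reading.
      fs.read(fd, buffer, 0, length, position, (readErr, bytesRead) => {
        if (readErr) return finish(readErr);
        finish(null, buffer.subarray(0, bytesRead));
      });
    });
  });
}
```

Only the requested bytes are read from disk, so the cost should not depend on the size of the file.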


Using fs.createReadStream

Alternatively, if you want to use the createReadStream function, then specify the start and end options: https://nodejs.org/api/fs.html#fs_fs_createreadstream_path_options

fs.createReadStream(path[, options])

options can include start and end values to read a range of bytes from the file instead of the entire file.
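
A rough sketch of the stream variant, assuming a Node.js version where readable streams support for await. The helper name streamLastBytes is hypothetical. Note that end is inclusive, so the last byte of the file sits at index size - 1.

```javascript
const fs = require('fs');

// Hypothetical helper: resolves with the last `count` bytes of `filePath`.
async function streamLastBytes(filePath, count) {
  const { size } = await fs.promises.stat(filePath);
  if (size === 0 || count <= 0) return Buffer.alloc(0);

  // Clamp so that files shorter than `count` are read in full.
  const start = Math.max(0, size - count);
  // `end` is inclusive, so the final byte of the file is at index size - 1.
  const stream = fs.createReadStream(filePath, { start, end: size - 1 });

  // Collect the chunks of the requested range and join them into one Buffer.
  const chunks = [];
  for await (const chunk of stream) {
    chunks.push(chunk);
  }
  return Buffer.concat(chunks);
}
```

For a small tail such as 10 bytes the stream will typically deliver a single chunk, but collecting into an array keeps the sketch correct for larger ranges too.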

Upvotes: 6
