revy

Reputation: 4707

Incrementally upload zip data from memory to a single file on AWS S3 bucket using Node.js streams

I have a ReadableStream of uncompressed text data that I need to store on a single zip-compressed file on S3.

What I don't want to do: buffer the whole data set in memory (or write it to a temporary file) before compressing and uploading it.

What I would like to do: append the text data incrementally to a single zip entry and stream the compressed output to S3 as the data is produced.

What I have tried so far:

import {S3} from 'aws-sdk';
import {PassThrough} from 'stream';
import archiver from 'archiver';

const s3 = new S3({region: 'us-east-1'});

(async () => {
    const stream = new PassThrough();
    const archive = archiver('zip', {
        zlib: { level: 9 }
    });
    archive.pipe(stream);

    const upload = s3.upload({
        Bucket: 'my-bucket',
        Key: 'stream.zip',
        Body: stream,
    }).promise();

    // Simulate the readable stream we are reading from.
    // Note: each append() with a distinct name creates a separate entry in the archive.
    for (let i = 0; i < 100; i++) {
        const textData = `text-data-${i}`;
        archive.append(Buffer.from(`${textData}\n`, 'utf8'), { name: `file-${i}.txt` });
    }

    await archive.finalize();
    await upload;
})();

This is not correct because it generates multiple files in the output archive stream.zip on S3 (file-0.txt, file-1.txt, etc.). On the other hand, if I use a single file name when appending data to the archive, I would need to buffer all the data in memory before appending, which defeats the purpose of streaming the data incrementally.
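For reference, the buffering approach I am trying to avoid would look roughly like this (a sketch only, reusing the archive and upload from the snippet above; it has to hold the entire payload in memory before a single append):

// What I don't want: collect everything in memory, then append once under a single name.
const chunks = [];
for (let i = 0; i < 100; i++) {
    chunks.push(Buffer.from(`text-data-${i}\n`, 'utf8'));
}
// The whole payload must exist in memory before this single append call.
archive.append(Buffer.concat(chunks), { name: 'myfile.txt' });
await archive.finalize();
await upload;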

Does anyone know a solution to this?

Upvotes: 0

Views: 110

Answers (1)

revy

Reputation: 4707

I may have found a solution: append a Stream instead of a Buffer to the archiver, so that all the data ends up in a single zip entry:

import {S3} from 'aws-sdk';
import {PassThrough} from 'stream';
import archiver from 'archiver';

const s3 = new S3({region: 'us-east-1'});

(async () => {
    const inputStream = new PassThrough();
    const outputStream = new PassThrough();

    const archive = archiver('zip', {
        zlib: { level: 9 }
    });
    archive.pipe(outputStream);
    // Append a stream instead of a Buffer. This should result in a single zipped file.
    archive.append(inputStream, {name: 'myfile.txt'});

    const upload = s3.upload({
        Bucket: 'my-bucket',
        Key: 'stream.zip',
        Body: outputStream,
    }).promise();
    
    for (let i = 0; i < 100; i++) {
        const textData = `text-data-${i}\n`;
        inputStream.write(textData, 'utf-8');
    }

    // Signal that no more data will be written to the zip entry.
    inputStream.end();
    // archiver ends the piped outputStream when the archive is finalized,
    // so there is no need to end it manually.
    await Promise.all([
        archive.finalize(),
        upload,
    ]);
})();
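One open point from my side is backpressure and error handling: PassThrough.write() returns false once its internal buffer is full, so for large inputs it is probably safer to wait for the 'drain' event before writing more, and to attach 'error' listeners to the archive and the streams. A rough sketch of what I mean, where writeChunk is just an illustrative helper:

// Sketch only, assuming the same inputStream as above.
// writeChunk waits for 'drain' whenever write() signals backpressure.
function writeChunk(stream, chunk) {
    return new Promise(resolve => {
        if (stream.write(chunk, 'utf8')) {
            resolve();
        } else {
            stream.once('drain', resolve);
        }
    });
}

for (let i = 0; i < 100; i++) {
    await writeChunk(inputStream, `text-data-${i}\n`);
}
inputStream.end();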

Would be glad to hear opinions about this.

Upvotes: 0
