Radu Iamandi

Reputation: 65

AWS Lambda: unzip a gzip file without saving it locally

I'm trying to get a file from an S3 bucket (.gzip) and unzip it into another bucket. I couldn't find a way to do it without saving the file locally (on my PC). Is there a way to 'save' the file on Lambda and unzip it directly to S3? Thank you!

Upvotes: 2

Views: 7666

Answers (1)

johni

Reputation: 5568

Here's an example Lambda (gist):

const path = require('path');
const aws = require('aws-sdk');
const zlib = require('zlib');
const s3s = require('s3-streams');

const s3Client = new aws.S3();
const output_bucket = "stackoverflow-bucket";

exports.handler = (event, context, callback) => {
    context.callbackWaitsForEmptyEventLoop = false;

    event.Records.forEach(record => {
        // Note: keys in S3 events are URL-encoded; decode them if your keys
        // can contain spaces or special characters.
        const params = {
            Bucket: record.s3.bucket.name,
            Key: record.s3.object.key
        };

        // Stream the object from S3 and gunzip it on the fly when the key ends in ".gz"
        const isGzip = path.extname(params.Key) === ".gz";
        let readStream = s3Client.getObject(params).createReadStream();
        readStream = isGzip ? readStream.pipe(zlib.createGunzip()) : readStream;

        // Stream the (decompressed) bytes straight into the output bucket,
        // dropping the ".gz" extension from the destination key
        const writeStream = s3s.WriteStream(s3Client, {
            Bucket: output_bucket,
            Key: path.basename(params.Key, ".gz")
        });

        // begins the actual streaming
        readStream.pipe(writeStream);

        // a writable stream emits 'finish' (not 'end') once everything has been flushed
        writeStream.on('finish', () => {
            callback(null, `Handled ${JSON.stringify(params)}`);
        });
    });
};

Note that this code uses a 3rd-party library (s3-streams) for streaming bytes to S3, which is not natively supported by the Node.js SDK.
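If you'd rather avoid the extra dependency, a similar effect can be had with the SDK's managed uploader, s3.upload(), which accepts a readable stream as Body. A minimal sketch (not the original answer's approach; it reuses the s3Client, output_bucket and readStream names from the code above, and the key handling is illustrative):

// Sketch only: upload a (gunzipped) read stream with the AWS SDK v2 managed uploader.
const uploadStream = (readStream, key) =>
    s3Client.upload({
        Bucket: output_bucket,            // destination bucket, as above
        Key: path.basename(key, ".gz"),   // drop the ".gz" extension
        Body: readStream                  // upload() accepts a readable stream
    }).promise();

// Usage inside the handler, in place of s3s.WriteStream + pipe:
// uploadStream(readStream, params.Key)
//     .then(() => callback(null, `Handled ${params.Key}`))
//     .catch(callback);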

For that, see the documentation page here, which describes how you should package your Lambda together with its dependencies before uploading it to AWS.

You can set an S3 event to trigger your Lambda whenever a new file is put into your source bucket:

(Screenshot: configuring the S3 event trigger for the Lambda.)
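For reference, the handler above only relies on a couple of fields from that event; a trimmed sketch of the payload Lambda receives (values are illustrative):

// Trimmed example of the S3 "ObjectCreated" event delivered to the handler;
// only the fields the code reads are shown.
const sampleEvent = {
    Records: [
        {
            s3: {
                bucket: { name: "my-source-bucket" },   // -> params.Bucket
                object: { key: "logs/2018-01-01.gz" }   // -> params.Key
            }
        }
    ]
};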

Upvotes: 5
