WeCanBeFriends

Reputation: 691

Change CSV file In S3 With AWS Lambda

Is there a way to have the DynamoDB rows for each user backed up in S3 as a CSV file?

Then, using DynamoDB Streams, when a row is mutated, change that row in the CSV file in S3.

The CSV readers that are currently out there are geared towards parsing the CSV for use within the Lambda.

Whereas I would like to find the specific row given by the stream and replace it with another row, without having to load the whole file into memory, as it may be quite big. The reason I would like a backup in S3 is that in the future I will need to do batch processing on it, and reading 300k rows from DynamoDB within a short period of time is not preferable.
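For reference, this is roughly the shape of the stream-triggered Lambda I have in mind (just a sketch; the bucket name, key layout and user_id attribute are placeholders, and update_row_in_csv is the part I don't know how to do efficiently):

def handler(event, context):
    # Invoked by the DynamoDB stream on the users table
    for record in event['Records']:
        if record['eventName'] not in ('INSERT', 'MODIFY'):
            continue
        new_image = record['dynamodb']['NewImage']
        user_id = new_image['user_id']['S']  # placeholder key attribute
        # The open question: replace just this row in the CSV stored at
        # s3://my-backup-bucket/users.csv without reading the whole file
        update_row_in_csv('my-backup-bucket', 'users.csv', user_id, new_image)

def update_row_in_csv(bucket, key, user_id, new_image):
    # ??? - how to do this without loading the entire file into memory
    pass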

Upvotes: 1

Views: 5651

Answers (1)

avigil

Reputation: 2246

Read the data from S3, parse it as CSV using your favorite library, update it, then write it back to S3:

import io
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('mybucket')

with io.BytesIO() as data:
    bucket.download_fileobj('my_key', data)
    data.seek(0)  # rewind; download_fileobj leaves the position at the end of the buffer

    # parse csv data and update as necessary
    # then write back to s3

    data.seek(0)  # rewind again so the upload starts from the beginning of the buffer
    bucket.upload_fileobj(data, 'my_key')

Note that S3 does not support appending to or updating an object in place, if that was what you were hoping for (see here). You can only read and overwrite whole objects. You should take this into account when designing your system.
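For example, the read/parse/update/write cycle could look like this with the standard csv module. This is only a sketch of the pattern, not a drop-in solution: the bucket and key names, the user_id column, and the contents of updated_row (which in practice would be built from the DynamoDB stream record) are all assumptions.

import csv
import io
import boto3

s3 = boto3.resource('s3')
obj = s3.Bucket('mybucket').Object('my_key')

# Hypothetical replacement row, e.g. derived from the stream record
updated_row = {'user_id': '42', 'name': 'Alice', 'email': 'alice@example.com'}

# Download the whole CSV - S3 cannot modify an object in place
body = obj.get()['Body'].read().decode('utf-8')

# Swap in the updated row for the matching user_id, keep the rest unchanged
reader = csv.DictReader(io.StringIO(body))
rows = [updated_row if row['user_id'] == updated_row['user_id'] else row
        for row in reader]

# Re-serialize and overwrite the object
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
writer.writeheader()
writer.writerows(rows)
obj.put(Body=out.getvalue().encode('utf-8'))

Note that this still reads and rewrites the entire object; per the limitation above, there is no way to patch a single row of an S3 object.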

Upvotes: 4
