Cee Jay

Reputation: 21

Writing content of a large file to Dynamo from S3 with Lambda

I have multiple large CSV files in an S3 bucket. I want to write their data to a DynamoDB table. The issue is that my function runs for more than 15 minutes and hits the timeout error without completely writing the CSV file to DynamoDB. So is there a way to split the CSV into smaller parts?

Things I've tried so far

this - This doesn't re-invoke itself as it is supposed to (it writes a few lines to the table and then stops without any errors).
aws document - Gives an "s3fs module not found" error. I tried many things to make it work but couldn't.

Is there any way I can do this task?

Thank You

Upvotes: 1

Views: 667

Answers (2)

Cee Jay

Reputation: 21

I was able to fix my problem (partly) by increasing the write capacity on DynamoDB to a minimum of 1000. With that I could write 1 million records in 10 minutes, though I still needed to split the CSV file. Also, using batch_write instead of writing each item line by line helps tremendously.
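For reference, a minimal sketch of what the batch write path can look like with boto3 (the bucket, key, and table name are placeholders, and it assumes the CSV header names match your DynamoDB attribute names):

    import csv
    import boto3

    s3 = boto3.client("s3")
    dynamodb = boto3.resource("dynamodb")

    def write_csv_to_dynamo(bucket, key, table_name):
        # Stream the object from S3 instead of loading the whole file into memory
        body = s3.get_object(Bucket=bucket, Key=key)["Body"]
        lines = (line.decode("utf-8") for line in body.iter_lines())
        reader = csv.DictReader(lines)

        table = dynamodb.Table(table_name)
        # batch_writer buffers puts and flushes them 25 items at a time,
        # which is much faster than one put_item call per row
        with table.batch_writer() as batch:
            for row in reader:
                batch.put_item(Item=row)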

Upvotes: 1

tpschmidt

Reputation: 2717

I think the fan-out approach from your linked solution should be the best option.

Take a main Lambda function which splits the processing by dividing the file into line ranges (e.g. 1,000 lines each) and fans out one call per range to your processing Lambda, invoked asynchronously with InvocationType 'Event' rather than synchronously ('RequestResponse'). Each processing Lambda should then only read the CSV lines assigned to it (have a look here).
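A rough sketch of what that could look like with boto3. The worker function name (csv-chunk-processor), the table name, and the assumption that the total line count is known up front are all placeholders, and the line slicing here is the naive version (each worker still streams the object from the beginning), so adapt it to whatever line-selection approach you use:

    import json
    from itertools import islice

    import boto3

    lambda_client = boto3.client("lambda")
    s3 = boto3.client("s3")
    dynamodb = boto3.resource("dynamodb")

    CHUNK_SIZE = 1000  # lines handled per worker invocation

    def fan_out_handler(event, context):
        """Main Lambda: split the CSV into line ranges and invoke one worker per range."""
        total_lines = event["total_lines"]  # e.g. counted once when the file lands in S3
        for start in range(0, total_lines, CHUNK_SIZE):
            lambda_client.invoke(
                FunctionName="csv-chunk-processor",  # hypothetical worker function name
                InvocationType="Event",              # asynchronous, fire-and-forget
                Payload=json.dumps({
                    "bucket": event["bucket"],
                    "key": event["key"],
                    "start_line": start,
                    "end_line": min(start + CHUNK_SIZE, total_lines),
                }),
            )

    def worker_handler(event, context):
        """Worker Lambda: process only the line range it was given."""
        body = s3.get_object(Bucket=event["bucket"], Key=event["key"])["Body"]
        lines = (line.decode("utf-8") for line in body.iter_lines())
        table = dynamodb.Table("my-table")  # placeholder table name
        with table.batch_writer() as batch:
            for line in islice(lines, event["start_line"], event["end_line"]):
                values = line.split(",")
                batch.put_item(Item={"id": values[0], "raw": line})  # adapt to your schema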

If you already tried this, could you probably post parts of your solution?

Upvotes: 2
