Reputation: 150739
I'm working on an AWS Lambda function (Node 4.3) that needs to run through all the items in a DynamoDB table and update certain attributes.
The problem I'm having is how to get Lambda to wait until all of the DynamoDB operations are finished.
    var async = require('async');
    var aws = require('aws-sdk');
    var doc = new aws.DynamoDB.DocumentClient();

    exports.handler = (event, context, callback) => {
        doc.scan({
            TableName: 'Occupations_dev'
        }, function (err, data) {
            console.log(data.Items.length);
            var funcs = [];
            data.Items.forEach(function (item) {
                funcs.push(function (cb) {
                    item.Popularity = 0;
                    doc.put({
                        TableName: 'Occupations_dev',
                        Item: item
                    }, function (err, data) {
                        if (err) {
                            console.log("ERROR: " + item.Name);
                            cb(err);
                        } else {
                            console.log('Finished put for ' + item.Id);
                            cb(null, item);
                        }
                    });
                });
            });
            async.parallel(funcs, function (err, results) {
                console.log('Finished');
                if (err) {
                    context.fail(err);
                } else {
                    callback(null, 'Finished');
                }
            });
        });
    };
I tried using async.parallel to wait for all of the doc.put requests to finish, but the function ends with a "Process exited before completing request" error whenever it runs.
It does update some of the DynamoDB items but definitely not all of them.
I added some console.log calls for when there are errors, but the only output I see in the log is this:
START RequestId: b72fd7c6-14ed-11e7-a95a-c1185af4e870 Version: $LATEST
2017-03-30T02:08:11.691Z b72fd7c6-14ed-11e7-a95a-c1185af4e870 1362
END RequestId: b72fd7c6-14ed-11e7-a95a-c1185af4e870
REPORT RequestId: b72fd7c6-14ed-11e7-a95a-c1185af4e870 Duration: 37165.80 ms Billed Duration: 37200 ms Memory Size: 128 MB Max Memory Used: 128 MB
RequestId: b72fd7c6-14ed-11e7-a95a-c1185af4e870 Process exited before completing request
What's the proper way to make the Lambda function wait until everything is done? (It's not a huge amount of data so I'm not worried about running longer than 5 minutes and timing out.)
Upvotes: 3
Views: 3333
Reputation: 1488
The tasks passed to async.parallel all start at once, which is likely flooding DynamoDB with many simultaneous updates and raising "too many connections" errors at the DB level.
I'd recommend running the DB updates sequentially instead, for example with async.series. DynamoDB should have no trouble processing these updates one after the other.
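As a rough sketch of the "one after the other" idea, the helper below issues each put only after the previous one has called back, so a single write is in flight at a time. It uses no library so that it runs on its own; in the question's code, swapping async.parallel for async.series (which takes the same tasks-and-callback arguments) should have a similar effect. The names putOneByOne and fakeDoc are hypothetical, and fakeDoc is only an in-memory stand-in for the DocumentClient so the example can run without AWS.

```javascript
// Sketch: write items sequentially, one put in flight at a time.
// `putOneByOne` is a hypothetical helper, not part of the AWS SDK.
function putOneByOne(doc, tableName, items, done) {
    var index = 0;

    function next(err) {
        if (err) {
            return done(err); // stop at the first failed put
        }
        if (index >= items.length) {
            return done(null, index); // every item has been written
        }
        var item = items[index++];
        item.Popularity = 0;
        doc.put({ TableName: tableName, Item: item }, next);
    }

    next();
}

// In-memory stand-in for DocumentClient (assumption, for illustration only).
var fakeDoc = {
    put: function (params, cb) {
        setImmediate(cb, null, {});
    }
};

var demoItems = [{ Id: 1 }, { Id: 2 }, { Id: 3 }];
putOneByOne(fakeDoc, 'Occupations_dev', demoItems, function (err, count) {
    // logs: Finished 3 puts
    console.log(err ? 'Failed: ' + err : 'Finished ' + count + ' puts');
});
```

Inside the Lambda handler, the final callback is where callback(err) or callback(null, 'Finished') would go, so that the function only reports completion once the last put has returned.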
Upvotes: 1
Reputation: 4446
The message "Process exited before completing request" means that the JavaScript function exited before calling context.done (or context.succeed, etc.).
Here are some suggestions:
First of all, try increasing the memory limit for the function. The line Memory Size: 128 MB Max Memory Used: 128 MB suggests that the function ran out of memory and the process was killed before it could call the final callback.
After increasing the memory limit, you will probably see one of the following:
Your function will time out. In this case you may need to increase the table's provisioned capacity (and/or your Lambda timeout).
Even if the function ends without a timeout, you will probably see that not all of the table's records are processed. This is because scan and query operations may not return all of the table's rows if the total size of the scanned items exceeds the maximum data set size limit of 1 MB. When a scan completes, you should check whether LastEvaluatedKey is returned along with the data. If it is, you should make another scan, passing the LastEvaluatedKey value as the ExclusiveStartKey parameter.
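The LastEvaluatedKey check described above can be sketched as a small loop that keeps scanning until no key comes back. This is only an illustration: scanAll and fakeScanDoc are hypothetical names, and fakeScanDoc is an in-memory stand-in that returns two pages so the example runs without AWS. With the real DocumentClient, the same doc.scan call and the same LastEvaluatedKey / ExclusiveStartKey fields would be used.

```javascript
// Sketch: follow LastEvaluatedKey until the whole table has been scanned.
// `scanAll` is a hypothetical helper, not part of the AWS SDK.
function scanAll(doc, params, done) {
    var items = [];

    function scanPage(startKey) {
        // Copy the params so the caller's object is left untouched.
        var pageParams = Object.assign({}, params);
        if (startKey) {
            pageParams.ExclusiveStartKey = startKey;
        }
        doc.scan(pageParams, function (err, data) {
            if (err) {
                return done(err);
            }
            items = items.concat(data.Items);
            if (data.LastEvaluatedKey) {
                scanPage(data.LastEvaluatedKey); // more pages remain
            } else {
                done(null, items); // last page reached
            }
        });
    }

    scanPage(null);
}

// In-memory stand-in that serves two pages (assumption, for illustration only).
var pages = {
    start: { Items: [{ Id: 'a' }, { Id: 'b' }], LastEvaluatedKey: { Id: 'b' } },
    b: { Items: [{ Id: 'c' }] }
};
var fakeScanDoc = {
    scan: function (params, cb) {
        var key = params.ExclusiveStartKey ? params.ExclusiveStartKey.Id : 'start';
        setImmediate(cb, null, pages[key]);
    }
};

scanAll(fakeScanDoc, { TableName: 'Occupations_dev' }, function (err, items) {
    // logs: Scanned 3 items
    console.log(err ? 'Failed: ' + err : 'Scanned ' + items.length + ' items');
});
```

Collecting every page in memory is fine for a small table like the one in the question; for a large table it would probably be better to process each page as it arrives.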
Upvotes: 1