Reputation: 65
I tried to get all objects in an AWS S3 bucket using @aws-sdk/client-s3. Instead of getting 1000 objects, the program exits after 50 object downloads.
In the while() loop, obj_list.Contents.length equals 1000, but the process exits after receiving responses for only 50 of the GetObjectCommand requests.
import { S3Client, ListObjectsV2Command, GetObjectCommand } from "@aws-sdk/client-s3"
(async () => {
    const client = new S3Client({
        credentials: {
            accessKeyId: 'XXXXXXXXXXXXXXXXXXXXX',
            secretAccessKey: 'XXXXXXXXXXXXXXXXXXXXX'
        },
        region: "us-east-1"
    })
    const input = {
        Bucket: 'Bucket-Name'
    }
    const cmd = new ListObjectsV2Command(input)
    const obj_list = await client.send(cmd)
    let i = 0
    while (i < obj_list.Contents.length) {
        const command = new GetObjectCommand({
            Bucket: 'Bucket-Name',
            Key: obj_list.Contents[i++].Key
        })
        client.send(command)
            .then(
                (data) => {
                    console.log(`Content length: ${data.ContentLength}`)
                },
                (error) => {
                    const { requestId, cfId, extendedRequestId } = error.$metadata
                    console.log(`Error: ${requestId}, ${cfId}, ${extendedRequestId}`)
                }
            )
    }
    console.log("Done")
})();
console.log("End")
Here is the output in Visual Studio Code console:
C:\Program Files\nodejs\node.exe .\test.js
End
Done
50
Content length: 38535294
What are the possible reasons for this?
UPD. Here is code which creates an array of Promises in batches, resolves those Promises, then creates the next slice. No difference: after 50 requests the script exits with code 13 ("Process exited with code 13").
The statuses of all resolved Promises are 'fulfilled'.
// <list> contains all objects from the bucket, as in the code above
// ...
const step = 50
let i = 0
while (i < obj_list.Contents.length) {
    const to = Math.min(i + step, obj_list.Contents.length)
    let promises = []
    for (let f = i; f < to; ++f) {
        promises.push(client.send(
            new GetObjectCommand({
                Bucket: 'Bucket',
                Key: obj_list.Contents[f].Key
            })
        ))
    }
    const statuses = await Promise.allSettled(promises)
    i = to
}
This code exits on await Promise.all(promises) with exit code 13:
const promises = obj_list.Contents.map(async (obj_cont) => {
    const command = new GetObjectCommand({
        Bucket: 'Bucket',
        Key: obj_cont.Key
    })
    const data = await client.send(command)
});
const statuses = await Promise.all(promises)
Terminal output:
C:\Program Files\nodejs\node.exe .\async_batch.js
Process exited with code 13
Upvotes: 0
Views: 3264
Reputation: 140
I posted the solution to this same problem in another thread, so here is a link for it.
Basically, creating a new Agent with the keepAlive: false option and passing it when instantiating the S3() client does the job. Here is an example of what it looks like:
import { Agent } from "https";
import { S3 } from "@aws-sdk/client-s3";

const s3 = new S3({
    requestHandler: {
        httpsAgent: new Agent({ keepAlive: false })
    }
});
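Depending on the SDK version, the plain object form may not be accepted, and the agent has to be wrapped in a NodeHttpHandler instead. A minimal sketch of that variant, assuming the handler comes from @smithy/node-http-handler (it lived in @aws-sdk/node-http-handler in older releases):
import { Agent } from "https";
import { S3 } from "@aws-sdk/client-s3";
import { NodeHttpHandler } from "@smithy/node-http-handler";

// keepAlive: false lets each request close its socket once the response is consumed.
const s3 = new S3({
    requestHandler: new NodeHttpHandler({
        httpsAgent: new Agent({ keepAlive: false })
    })
});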
I also used the @supercharge/promise-pool module to create a pool of Promises and execute 50 at a time, so as not to hit the limit of socket connections. It looks something like this:
import { GetObjectCommand } from "@aws-sdk/client-s3";
import { PromisePool } from "@supercharge/promise-pool";

const filenames = [] // array with Keys to the S3 Objects I aimed to download
const POOL_LIMIT = 50;

const downloadAndZipSingleFile = async (Key) => {
    const response = await s3.send(new GetObjectCommand({ Bucket, Key }));
    const body = await response.Body.transformToString();
    // whatever other operations you need...
}

await PromisePool
    .for(filenames)
    .withConcurrency(POOL_LIMIT)
    .process(downloadAndZipSingleFile);
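If I recall the module's API correctly, process() resolves to an object with results and errors arrays, so failed downloads can be inspected without aborting the whole pool:
const { results, errors } = await PromisePool
    .for(filenames)
    .withConcurrency(POOL_LIMIT)
    .process(downloadAndZipSingleFile);
console.log(`Downloaded ${results.length} objects, ${errors.length} failures`);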
Upvotes: 0
Reputation: 31
I ran into this exact problem today and it took me ages to work out what could be causing such odd behaviour, as my script was also exiting after 50 executions of the loop with the error "Process exited with code 13".
I've used the AWS S3 Client many times in the past for long running processes and never hit this problem before.
What got me to the answer in the end was inspecting the response from the GetObjectCommand, where I realised that the Body content of the response was left unresolved. In this instance, I wasn't actually after the content of the object, just the metadata, and it seems that there is a limit to the number of requests you can leave unresolved before the process exits (I'm sure someone cleverer than me can explain that better).
The solution for me was to simply read the response.Body content (even though I wasn't going to use it) inside the loop, as follows:
const response = await s3Client.send(
    new GetObjectCommand({
        Bucket: s3Bucket,
        Key: key,
    }),
);
const body = await response.Body.transformToString();
The script was then able to execute to completion. Hope this helps someone.
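As a side note beyond the original fix: when only the metadata is needed, another option is to avoid the Body stream entirely with HeadObjectCommand, which returns the object's metadata and no content. A minimal sketch using the same variable names as above:
import { HeadObjectCommand } from "@aws-sdk/client-s3";

// HeadObjectCommand returns ContentLength, ETag, LastModified, etc.,
// so there is no unconsumed Body stream holding a socket open.
const head = await s3Client.send(
    new HeadObjectCommand({
        Bucket: s3Bucket,
        Key: key,
    }),
);
console.log(`Content length: ${head.ContentLength}`);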
Upvotes: 3
Reputation: 2985
I created a working example on my GitHub. It can make it easier for you to compare/debug the code.
It is an example of how to use the AWS SDK v3 S3 client to upload files to S3 in parallel or in sequence:
- Run npm install to install the dependencies.
- Set the BUCKET environment variable to a random bucket name.
- Run node bucket-create.js to create the bucket.
- Run node bucket-delete.js to delete the bucket.
- It uses Promise.all to upload files in parallel (a minimal sketch of this idea follows below).
- It uses a while loop to upload files in sequence.
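Not the repository code itself, but a minimal sketch of the parallel-upload idea, assuming a hypothetical files array of key/body pairs and the BUCKET environment variable mentioned above:
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

const client = new S3Client({ region: "us-east-1" });
// Hypothetical input: each entry becomes one object in the bucket.
const files = [
    { key: "file-1.txt", body: "first file" },
    { key: "file-2.txt", body: "second file" },
];

// Start every upload immediately, then wait for all of them to finish.
const results = await Promise.all(
    files.map(({ key, body }) =>
        client.send(new PutObjectCommand({
            Bucket: process.env.BUCKET,
            Key: key,
            Body: body,
        }))
    )
);
console.log(`Uploaded ${results.length} objects`);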
This is happening because you are using the promise's .then() callbacks within the loop, without awaiting the result:
while (i < obj_list.Contents.length) {
    const command = // ...
    client.send(command)
        .then(
            (data) => {
                // callback
            },
            (error) => {
                // callback
            }
        )
}
The loop finishes before the JS event loop fires all the .then callbacks.
As suggested by @Ankush Jain, you can use await to resolve the promise:
while (i < obj_list.Contents.length) {
    try {
        const command = // ...
        const data = await client.send //... code here
        console.log(`Content length: ${data.ContentLength}`);
    } catch (error) {
        const { requestId, cfId, extendedRequestId } = error.$metadata
        //... code here
    }
}
This will run the requests sequentially.
If you need to perform the requests in parallel, you can create an array of promises and use Promise.all or Promise.allSettled to await them.
const promises = obj_list.Contents.map(async (obj) => {
    const command = // ...
    const data = await client.send //... code here
}); // array of promises
await Promise.all(promises)
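One note on the choice: Promise.all rejects as soon as any request fails, while Promise.allSettled waits for every request and reports each outcome, which is usually safer when one missing object should not abort the whole batch. A small sketch of inspecting the settled results, reusing the names above:
const statuses = await Promise.allSettled(promises)
for (const s of statuses) {
    if (s.status === "rejected") {
        console.log(`Request failed: ${s.reason}`)
    }
}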
Upvotes: 0
Reputation: 7079
It seems you are missing await before the send method.
await client.send(command);
Refer to this example: List objects in an Amazon S3 bucket using an AWS SDK.
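In the context of the question's loop, a minimal sketch of what that change looks like (same names as the question's code, inside its async IIFE):
for (const obj of obj_list.Contents) {
    const command = new GetObjectCommand({
        Bucket: 'Bucket-Name',
        Key: obj.Key
    })
    const data = await client.send(command)
    console.log(`Content length: ${data.ContentLength}`)
}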
Upvotes: 1