Reputation: 851
I want to know the total size of a folder stored in S3 using the AWS SDK.
Note:-
I don't want to use any CLI command or the AWS console to find the size of my folder. I want to do this with the aws-sdk, as I mentioned above, so please don't mark this as a duplicate.
So far, what I found on the internet is to list all the objects in the folder and iterate through them. I did this and it works fine. Here is my code:
import AWS from 'aws-sdk';

AWS.config.region = "BUCKET_REGION";
AWS.config.credentials = new AWS.CognitoIdentityCredentials({
  IdentityPoolId: "COGNITO_ID",
});

let bucketName = "BUCKET_NAME";
let bucket = new AWS.S3({
  params: {
    Bucket: bucketName
  }
});

bucket.listObjects({ Prefix: "FOLDER_NAME", Bucket: bucketName }, function (err, data) {
  if (err) {
    console.log(err);
  } else {
    console.log(data);
    // data.Contents is the array through which I iterate to find the total size of the objects
  }
});
The problem is that at some point my folder contains so many objects that iterating over every element in the list becomes slow; it takes too much time just to calculate the size of the folder.
So I need a better way to calculate the size of a folder, and all I found is this command:
aws s3 ls s3://myBucket/level1/level2/ --recursive --summarize | awk 'BEGIN{ FS= " "} /Total Size/ {print $3}'
Is there any way I can do the above through the aws-sdk?
Any kind of help is appreciated. Thanks in advance.
Upvotes: 2
Views: 6113
Reputation: 1019
Node.js (AWS SDK v2) code to get the folder size, paginating with ListObjectsV2:
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

async function getFolderSize(bucketName, folderPrefix) {
  let totalSize = 0;
  let continuationToken;
  do {
    const params = {
      Bucket: bucketName,
      Prefix: folderPrefix, // folder path, e.g. 'my-folder/'
      ContinuationToken: continuationToken, // for pagination
    };
    const data = await s3.listObjectsV2(params).promise();
    totalSize += data.Contents.reduce((acc, obj) => acc + obj.Size, 0);
    continuationToken = data.IsTruncated ? data.NextContinuationToken : null;
  } while (continuationToken);

  console.log(`Total Size: ${totalSize} bytes`);
  console.log(`Total Size: ${(totalSize / (1024 * 1024)).toFixed(2)} MB`);
  console.log(`Total Size: ${(totalSize / (1024 * 1024 * 1024)).toFixed(2)} GB`);
  return totalSize;
}
Upvotes: 0
Reputation: 470
This Lambda method is pretty fast and works well for buckets with up to 100,000 objects, if you are not concerned about a delay of a couple of seconds. The AWS CLI has about the same performance because it appears to use the same API, and S3 metrics or CloudWatch stats can be more complicated to configure, especially if you only want to look at specific folders.
Storing this info in a database and triggering the method at intervals using flags is the way to go for small buckets or folders.
const AWS = require('aws-sdk'), s3 = new AWS.S3();

exports.handler = async function (event) {
  let totalSize = 0, ContinuationToken;
  do {
    const resp = await s3.listObjectsV2({
      Bucket: bucketName, // your bucket name
      Prefix: `folder/subfolder/`,
      ContinuationToken
    }).promise();
    resp.Contents.forEach(o => totalSize += o.Size);
    ContinuationToken = resp.NextContinuationToken;
  } while (ContinuationToken);
  console.log(totalSize); // your answer
};
Upvotes: 3
Reputation: 270144
It appears that listing and iterating over every object is too slow for your situation. Rather than listing objects and calculating sizes, I would recommend two alternatives:
Amazon S3 Inventory
Amazon S3 Inventory can provide a daily CSV file with details of all objects in a bucket. You could then take this data and calculate the total.
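As a rough sketch (not part of the answer itself), summing the Size column of an inventory CSV could look like this; the column index is an assumption, since the column order depends on which fields your inventory configuration includes:

```javascript
// Hypothetical sketch: sum object sizes from an S3 Inventory CSV report.
// Assumes the inventory was configured with the fields Bucket, Key, Size,
// in that order; adjust the column index to match your configuration.
function totalSizeFromInventoryCsv(csvText) {
  return csvText
    .trim()
    .split('\n')
    .map(line => line.split(','))                                  // naive CSV split
    .reduce((sum, cols) => sum + Number(cols[2].replace(/"/g, '')), 0);
}
```

To scope the total to a folder rather than the whole bucket, you could first filter rows whose Key column starts with the folder prefix.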
Amazon CloudWatch bucket metrics
Amazon CloudWatch has several metrics related to Amazon S3 buckets:
BucketSizeBytes
NumberOfObjects
I'm not sure how often those metrics are updated (they are not instant), but BucketSizeBytes seems like it would be ideal for you.
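For illustration only, fetching BucketSizeBytes with the AWS SDK v2 might look like the sketch below. The 'StandardStorage' dimension value is an assumption (other storage classes report under their own StorageType values), and since the metric is reported roughly once per day, the sketch looks back a few days and takes the latest datapoint:

```javascript
// Hedged sketch: read the daily BucketSizeBytes metric for a bucket.
// `cloudwatch` is an AWS SDK v2 CloudWatch client, e.g. new AWS.CloudWatch().
async function bucketSizeBytes(cloudwatch, bucketName) {
  const data = await cloudwatch.getMetricStatistics({
    Namespace: 'AWS/S3',
    MetricName: 'BucketSizeBytes',
    Dimensions: [
      { Name: 'BucketName', Value: bucketName },
      { Name: 'StorageType', Value: 'StandardStorage' }, // assumption
    ],
    StartTime: new Date(Date.now() - 3 * 24 * 60 * 60 * 1000), // 3 days back
    EndTime: new Date(),
    Period: 86400,            // one datapoint per day
    Statistics: ['Average'],
  }).promise();
  // Take the most recent datapoint, if any have been reported yet.
  const points = data.Datapoints
    .slice()
    .sort((a, b) => a.Timestamp - b.Timestamp);
  return points.length ? points[points.length - 1].Average : null;
}
```

You would call it as, say, bucketSizeBytes(new AWS.CloudWatch(), 'myBucket'). Note this gives the whole bucket's size; CloudWatch does not break it down per folder.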
If all else fails...
If the above two options do not meet your needs (eg you need to know the metrics "right now"), the remaining option would be to maintain your own database of objects. The database would need to be updated whenever an object is added or removed from the bucket (which can be done by using Amazon S3 Events to trigger an AWS Lambda function). You could then consult your own database to have the information available rather quickly.
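A minimal sketch of the bookkeeping such a Lambda function would do, assuming the handler is given the object's size (an ObjectRemoved event does not carry one, so the real handler would look the previous size up in your database first):

```javascript
// Hypothetical helper for an S3-events-driven Lambda: compute how much the
// stored running total should change for one event record.
// Positive for creates, negative for removes, zero for anything else.
function sizeDelta(eventName, objectSize) {
  if (eventName.startsWith('ObjectCreated')) return objectSize;
  if (eventName.startsWith('ObjectRemoved')) return -objectSize;
  return 0;
}
```

The Lambda handler would apply this delta to the running total kept in your database (for example, a DynamoDB item per folder prefix), so reading the folder size becomes a single lookup.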
Upvotes: 4