Ashik Mohammed
Ashik Mohammed

Reputation: 1115

Delete multiple AWS S3 objects with version id

I'm trying to remove the current marker from AWS S3 in order to restore a previous version. I have plenty of files to delete. But when using the following command, yes or no prompt is triggered. Is there any way to avoid this prompt?

aws s3api list-object-versions --bucket my_bucket --output json --query 'DeleteMarkers[].[Key, VersionId]' --region us-east-1 | jq -r '.[] | "--key '\''" + .[0] + "'\'' --version-id " + .[1]' | xargs -p -L1 aws s3api delete-object --bucket my_bucket --region us-east-1

Following is the sample output for this script.

aws s3api delete-object --bucket my_bucket --region us-east-1 --key 58a7ec1f77dd6e3ff3024a65/1487498355//39.jpeg --version-id _VY88LfKX3tXy2JABbC6rhDzKl3xhkg0 ?...

Here we need to give "y" as the answer. Then next prompt will be triggered. The issue is I have 800,000 files to do the same operation.

Now I'm using the following command.

echo '#!/bin/bash' > undeleteScript.sh && aws --output text s3api list-object-versions --bucket my_bucket | grep -E "^DELETEMARKERS" | awk '{FS = "[\t]+"; print "aws s3api delete-object --bucket my_bucket --key \42"$3"\42 --version-id "$5";"}' >> undeleteScript.sh && . undeleteScript.sh; rm -f undeleteScript.sh;

But, this seems to be very slow. Any other commands to make this easy & quick?

Upvotes: 1

Views: 1998

Answers (2)

Ashik Mohammed
Ashik Mohammed

Reputation: 1115

I tried the second script posted in question during the off-peak time. 3 Gb data were restored per hour. Since the data was restored from the glacier, this was the maximum speed I got.

I'm posting the script again for reference.

echo '#!/bin/bash' > undeleteScript.sh && aws --output text s3api list-object-versions --bucket my_bucket | grep -E "^DELETEMARKERS" | grep -E "2017-09-11" | awk '{FS = "[\t]+"; print "aws s3api delete-object --bucket my_bucket --key \42"$3"\42 --version-id "$5";"}' >> undeleteScript.sh && . undeleteScript.sh; rm -f undeleteScript.sh;

where,

my_bucket = S3 bucket name

2017-09-11 = Files deleted date(This marker will be deleted and previous latest version will be restored).

Upvotes: 0

mhumesf
mhumesf

Reputation: 451

I would look at using a language other than bash to perform a large loop-like operation. Something like JS or Go.

var AWS = require('aws-sdk');
var s3 = new AWS.S3({apiVersion: '2006-03-01'});
var _ = require('underscore');

var params = {
    Bucket: "exampleBucket",
    Prefix: "exampleItem"
};

var getObjectsToDelete = s3.listObjectVersions(params).promise()

getObjectsToDelete.then(function(data) {
    console.log(data)
    return _.map(data.Versions, function(object){
        return _.pick(object, 'Key', 'VersionId')
    })
}).then(function(data) {
    var theDeleted = {
        Bucket: params.Bucket,
        Delete: {
            Objects: data
        }
    }
    s3.deleteObjects(theDeleted, function(err, data) {
        if (err) console.log(err, err.stack); // an error occurred
        else     console.log(data);
    })
})

Another option would be to setup a LifeCycle rule for object versions in the bucket. Something for NoncurrentVersionExpiration.

https://docs.aws.amazon.com/AmazonS3/latest/dev/DeletingObjectVersions.html

Upvotes: 1

Related Questions