svakili

Reputation: 2189

Delete all items in a DynamoDB table using bash with both partition and sort keys

I'm trying to delete all items in a DynamoDB table that has both partition and sort keys using AWS CLI in bash. The best thing I've found so far is:

aws dynamodb scan --table-name $TABLE_NAME --attributes-to-get "$KEY" \
--query "Items[].$KEY.S" --output text | \
tr "\t" "\n" | \
xargs -t -I keyvalue aws dynamodb delete-item --table-name $TABLE_NAME \
--key "{\"$KEY\": {\"S\": \"keyvalue\"}}"

But this does not work with a table that has both the partition key and the sort key, and I have not yet been able to make it work with such a table. Any idea how to modify the script to make it work for a table with composite keys?

Upvotes: 19

Views: 68015

Answers (8)

Cheruvian

Reputation: 5867

Depending on the size of your table this can be too expensive and result in downtime. Remember that deletes cost you the same as a write, so you'll get throttled by your provisioned WCU. It would be much simpler and faster to just delete and recreate the table.

# this uses jq, but basically we're just removing
# some of the JSON fields that describe an existing
# DynamoDB table and are not actually part of the table schema/definition
aws dynamodb describe-table --table-name $table_name | jq '.Table | del(.TableId, .TableArn, .ItemCount, .TableSizeBytes, .CreationDateTime, .TableStatus, .ProvisionedThroughput.NumberOfDecreasesToday)' > schema.json
# delete the table
aws dynamodb delete-table --table-name $table_name
# create table with same schema (including name and provisioned capacity)
aws dynamodb create-table --cli-input-json file://schema.json
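To put the throttling concern into rough numbers, here is a back-of-the-envelope sketch. It assumes each delete consumes 1 WCU per 1 KB of item size (rounded up) and that the table's whole provisioned write capacity is available for the deletes. The function name estimate_delete_seconds and its parameters are made up for illustration.

```shell
# Hypothetical helper: estimates the seconds needed to delete every item.
# Arguments: item count, item size in whole KB (already rounded up), provisioned WCU.
estimate_delete_seconds() {
  local item_count="$1" item_size_kb="$2" wcu="$3"
  # Integer ceiling division: (a + b - 1) / b
  echo $(( (item_count * item_size_kb + wcu - 1) / wcu ))
}
```

For example, estimate_delete_seconds 1000000 1 100 gives 10000, so a million 1 KB items at 100 WCU would take close to three hours of sustained writes.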

If you really want to, you can delete each item individually, and you're on the right track. You just need to specify both the hash and range keys in your scan projection and in the delete command.

# Comments cannot sit between backslash-continued lines, so they live up here:
#  - scan only the key attributes (JSON output is required for jq)
#  - use jq to get each item on its own line
#  - replace newlines with NUL terminators so we can tell xargs
#    to ignore special characters
#  - use the whole item as the key to delete (dynamo keys *are* dynamo items)
aws dynamodb scan \
  --attributes-to-get "$HASH_KEY" "$RANGE_KEY" \
  --table-name "$TABLE_NAME" --query "Items[*]" --output json \
  | jq --compact-output '.[]' \
  | tr '\n' '\0' \
  | xargs -0 -t -I keyItem \
    aws dynamodb delete-item --table-name "$TABLE_NAME" --key=keyItem
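The per-item approach issues one delete-item call per key. A batched variant using batch-write-item (which accepts at most 25 requests per call) might look like the sketch below. The function names build_delete_batches and delete_all_items_batched are hypothetical, HASH_KEY and RANGE_KEY are assumed to be set as in the pipeline above, and jq is assumed to be installed.

```shell
# Hypothetical helper: reads a JSON array of key objects on stdin and prints
# one compact RequestItems object per line, each holding at most 25 deletes.
build_delete_batches() {
  local table_name="$1"
  jq --compact-output --arg table "$table_name" '
    range(0; length; 25) as $i
    | {($table): (.[$i:($i + 25)] | map({DeleteRequest: {Key: .}}))}'
}

# Hypothetical driver: scans only the key attributes and sends each batch.
# Assumes HASH_KEY and RANGE_KEY hold the table's key attribute names.
delete_all_items_batched() {
  local table_name="$1"
  aws dynamodb scan --table-name "$table_name" \
    --attributes-to-get "$HASH_KEY" "$RANGE_KEY" \
    --query "Items" --output json \
    | build_delete_batches "$table_name" \
    | while IFS= read -r batch; do
        aws dynamodb batch-write-item --request-items "$batch"
      done
}
```

Note that batch-write-item can return UnprocessedItems when it is throttled; this sketch does not retry them, so a real script would need to check the response and resend.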

If you want to get super fancy, you can use the describe-table call to fetch the hash and range key names and populate $HASH_KEY and $RANGE_KEY, but I'll leave that as an exercise for you.
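One possible way to do that exercise is sketched here, using the CLI's --query option on describe-table rather than jq. The helper name get_key_name is made up for illustration. For a table without a range key, the CLI's text output should print None for the missing value, so treat that case separately.

```shell
# Hypothetical helper: prints the attribute name of the given key type
# (HASH or RANGE) for a table, taken from Table.KeySchema.
get_key_name() {
  local table_name="$1" key_type="$2"
  aws dynamodb describe-table \
    --table-name "$table_name" \
    --query "Table.KeySchema[?KeyType=='${key_type}'].AttributeName | [0]" \
    --output text
}

# Usage (not executed here):
#   HASH_KEY=$(get_key_name "$TABLE_NAME" HASH)
#   RANGE_KEY=$(get_key_name "$TABLE_NAME" RANGE)
```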

Upvotes: 39

kfir yahalom

Reputation: 51

If you would like to do this without writing a script, as a one-time, semi-manual operation (one click starts the deletion, with automatic pagination), and you have permission to manage the table from the AWS console, you can use this Chrome extension: https://chromewebstore.google.com/detail/dynamodb-auto-delete/oiicbadbamlhijkjildbhamionnbdaia?authuser=0&hl=en&pli=1

Upvotes: 0

ilromape

Reputation: 21

I've used some of the examples here to write a script that reads the table's parameters, deletes the table, and re-creates it. It's working fine:

TABLE_NAME='<your_table_name>' ;\
aws dynamodb describe-table --table-name $TABLE_NAME \
|jq '.Table + .Table.BillingModeSummary
|del(.TableId, .TableArn, .ItemCount, .TableSizeBytes, .CreationDateTime, 
.TableStatus, .ProvisionedThroughput, .BillingModeSummary, 
.LastUpdateToPayPerRequestDateTime, .GlobalSecondaryIndexes[].IndexStatus, 
.GlobalSecondaryIndexes[].IndexSizeBytes, 
.GlobalSecondaryIndexes[].ItemCount, .GlobalSecondaryIndexes[].IndexArn, 
.GlobalSecondaryIndexes[].ProvisionedThroughput)' > tb_schema.json
aws dynamodb delete-table --table-name $TABLE_NAME
aws dynamodb create-table --cli-input-json file://tb_schema.json

Upvotes: 0

Alan Cooney

Reputation: 47

I've created a node module to do this:

https://www.npmjs.com/package/dynamodb-empty

yarn global add dynamodb-empty
dynamodb-empty --table tableName

Upvotes: 3

gildniy

Reputation: 3913

Based on the answers from @Adel and @codeperson here, I made a function using the Amplify CLI (with the Hello World template), where the table name and hash key have to be passed in the event object:

/* Amplify Params - DO NOT EDIT
    API_DEALSPOON_GRAPHQLAPIENDPOINTOUTPUT
    API_DEALSPOON_GRAPHQLAPIIDOUTPUT
Amplify Params - DO NOT EDIT */

const AWS = require('aws-sdk')
const environment = process.env.ENV
const region = process.env.REGION
const apiDealspoonGraphQLAPIIdOutput = process.env.API_DEALSPOON_GRAPHQLAPIIDOUTPUT

exports.handler = async (event) => {

    const DynamoDb = new AWS.DynamoDB.DocumentClient({region});

    // const tableName = "dev-invite";
    // const hashKey = "InviteToken";
    let {tableName, hashKey} = event
    
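    // Amplify table names follow <name>-<apiId>-<env>; the separator must not be quoted inside the template literal
    tableName = `${tableName}-${apiDealspoonGraphQLAPIIdOutput}-${environment}`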
    
    // Customization 4: add logic to determine which items to delete (return true if you want to delete the respective item)
    // If you don't want to filter anything out, then just return true in this function (or remove the filter step below, where this filter is used)
    const shouldDeleteItem = (item) => {
        return item.Type === "SECURE_MESSAGE" || item.Type === "PATIENT";
    };

    const getAllItemsFromTable = async (lastEvaluatedKey) => {
        const res = await DynamoDb.scan({
            TableName: tableName,
            ExclusiveStartKey: lastEvaluatedKey
        }).promise();
        return {items: res.Items, lastEvaluatedKey: res.LastEvaluatedKey};
    };

    const deleteAllItemsFromTable = async (items) => {
        let numItemsDeleted = 0;
        // Split items into batches of 25
        // 25 items is max for batchWrite
        await asyncForEach(split(items, 25), async (patch, i) => {
            const requestItems = {
                [tableName]: patch.filter(shouldDeleteItem).map(item => {
                    numItemsDeleted++;
                    return {
                        DeleteRequest: {
                            Key: {
                                [hashKey]: item[hashKey]
                            }
                        }
                    };
                })
            };
            if (requestItems[tableName].length > 0) {
                await DynamoDb.batchWrite({RequestItems: requestItems}).promise();
                console.log(`finished deleting ${numItemsDeleted} items this batch`);
            }
        });

        return {numItemsDeleted};
    };

    function split(arr, n) {
        const res = [];
        while (arr.length) {
            res.push(arr.splice(0, n));
        }
        return res;
    }

    async function asyncForEach(array, callback) {
        for (let index = 0; index < array.length; index++) {
            await callback(array[index], index, array);
        }
    }

    let lastEvaluatedKey;
    let totalItemsFetched = 0;
    let totalItemsDeleted = 0;

    console.log(`------ Deleting from table ${tableName}`);

    do {
        const {items, lastEvaluatedKey: lek} = await getAllItemsFromTable(lastEvaluatedKey);
        totalItemsFetched += items.length;
        console.log(`--- a group of ${items.length} was fetched`);

        const {numItemsDeleted} = await deleteAllItemsFromTable(items);
        totalItemsDeleted += numItemsDeleted;
        console.log(`--- ${numItemsDeleted} items deleted`);

        lastEvaluatedKey = lek;
    } while (!!lastEvaluatedKey);

    console.log("Done!");
    console.log(`${totalItemsFetched} items total fetched`);
    console.log(`${totalItemsDeleted} items total deleted`);
};

Upvotes: 0

Adel

Reputation: 191

If you are interested in doing it with Node.js, have a look at this example (I'm using TypeScript here). Further information can be found in the AWS docs.

import AWS from 'aws-sdk';

const DynamoDb = new AWS.DynamoDB.DocumentClient({
  region: 'eu-west-1'
});

// Note: a single scan returns at most 1 MB of data; larger tables
// need to be paginated with LastEvaluatedKey / ExclusiveStartKey.
export const getAllItemsFromTable = async (TableName: string) => {
  const Res = await DynamoDb.scan({ TableName }).promise();
  return Res.Items ?? [];
};

export const deleteAllItemsFromTable = async (
  TableName: string,
  items: Record<string, any>[],
  hashKey = 'id'
) => {
  let counter = 0;
  // split items into batches of 25
  // 25 items is max for batchWrite
  // await the loop so the returned promise resolves only after all batches are sent
  await asyncForEach(split(items, 25), async (patch, i) => {
    const RequestItems = {
      // computed property: the request map is keyed by the actual table name
      [TableName]: patch.map(item => {
        return {
          DeleteRequest: {
            Key: {
              [hashKey]: item[hashKey]
            }
          }
        };
      })
    };
    await DynamoDb.batchWrite({ RequestItems }).promise();
    counter += patch.length;
    console.log('counter : ', counter);
  });
};

function split(arr, n) {
  var res = [];
  while (arr.length) {
    res.push(arr.splice(0, n));
  }
  return res;
}

async function asyncForEach(array, callback) {
  for (let index = 0; index < array.length; index++) {
    await callback(array[index], index, array);
  }
}

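const tableName = "table";
// assuming table hashKey is named "id"
// getAllItemsFromTable returns a promise, so resolve it before deleting
getAllItemsFromTable(tableName)
  .then(items => deleteAllItemsFromTable(tableName, items))
  .catch(console.error);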

Upvotes: 7

Bert Hutzler

Reputation: 9

We had some tables with indexes, so a few more fields had to be deleted, including ".ProvisionedThroughput.LastDecreaseDateTime". It took a little work to figure out, as I'm totally new to jq ;-) But this is how it worked for us:

    aws dynamodb describe-table --table-name $table_name | jq '.Table | del(.TableId, .TableArn, .ItemCount, .TableSizeBytes, .CreationDateTime, .TableStatus, .LatestStreamArn, .LatestStreamLabel, .ProvisionedThroughput.NumberOfDecreasesToday, .ProvisionedThroughput.LastIncreaseDateTime, .ProvisionedThroughput.LastDecreaseDateTime, .GlobalSecondaryIndexes[].IndexSizeBytes, .GlobalSecondaryIndexes[].ProvisionedThroughput.NumberOfDecreasesToday, .GlobalSecondaryIndexes[].IndexStatus, .GlobalSecondaryIndexes[].IndexArn, .GlobalSecondaryIndexes[].ItemCount)' > schema.json

Upvotes: 0

CodePredator

Reputation: 415

To correct what @Cheruvian has posted: the following commands work. There are a few more fields we need to exclude while creating schema.json.

aws dynamodb describe-table --table-name $table_name | jq '.Table | del(.TableId, .TableArn, .ItemCount, .TableSizeBytes, .CreationDateTime, .TableStatus, .LatestStreamArn, .LatestStreamLabel, .ProvisionedThroughput.NumberOfDecreasesToday, .ProvisionedThroughput.LastIncreaseDateTime)' > schema.json

aws dynamodb delete-table --table-name $table_name

aws dynamodb create-table --cli-input-json file://schema.json
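Because delete-table returns before the table is actually gone, running create-table straight afterwards can fail while the old table is still being deleted. A minimal sketch that waits in between, using the CLI's wait subcommands, is shown here. The function name recreate_table is hypothetical, and schema.json is assumed to have been produced as above.

```shell
# Hypothetical helper: deletes a table, waits until it is gone,
# recreates it from a saved schema file, and waits until it exists.
recreate_table() {
  local table_name="$1" schema_file="$2"
  aws dynamodb delete-table --table-name "$table_name" &&
    aws dynamodb wait table-not-exists --table-name "$table_name" &&
    aws dynamodb create-table --cli-input-json "file://${schema_file}" &&
    aws dynamodb wait table-exists --table-name "$table_name"
}

# Usage (not executed here):
#   recreate_table "$table_name" schema.json
```

The steps are chained with && so that a failed delete does not lead to an attempted create.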

Upvotes: 14
