Reputation: 2189
I'm trying to delete all items in a DynamoDB table that has both partition and sort keys using AWS CLI in bash. The best thing I've found so far is:
aws dynamodb scan --table-name $TABLE_NAME --attributes-to-get "$KEY" \
--query "Items[].$KEY.S" --output text | \
tr "\t" "\n" | \
xargs -t -I keyvalue aws dynamodb delete-item --table-name $TABLE_NAME \
--key "{\"$KEY\": {\"S\": \"keyvalue\"}}"
But this does not work with a table that has both the partition key and the sort key, and I have not yet been able to make it work with such a table. Any idea how to modify the script to make it work for a table with composite keys?
Upvotes: 19
Views: 68015
Reputation: 5867
Depending on the size of your table this can be too expensive and result in downtime. Remember that deletes cost you the same as a write, so you'll get throttled by your provisioned WCU. It would be much simpler and faster to just delete and recreate the table.
# this uses jq, but basically we're just removing
# some of the json fields that describe an existing
# ddb table and are not actually part of the table schema/definition
aws dynamodb describe-table --table-name $table_name | jq '.Table | del(.TableId, .TableArn, .ItemCount, .TableSizeBytes, .CreationDateTime, .TableStatus, .ProvisionedThroughput.NumberOfDecreasesToday)' > schema.json
# delete the table
aws dynamodb delete-table --table-name $table_name
# create table with same schema (including name and provisioned capacity)
aws dynamodb create-table --cli-input-json file://schema.json
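To get a rough feel for the throttling point above, here is a back-of-the-envelope sketch. It assumes each delete consumes one write capacity unit per 1 KB of item size (rounded up) and that provisioned WCU is the only limit. The function name and the example numbers are hypothetical, and the estimate ignores bursts, retries, and index writes.

```javascript
// Rough estimate only: assumes 1 WCU per 1 KB (rounded up) per deleted item
// and ignores burst capacity, retries, and secondary-index writes.
function estimateDeleteSeconds(itemCount, avgItemSizeBytes, provisionedWcu) {
  const wcuPerItem = Math.max(1, Math.ceil(avgItemSizeBytes / 1024));
  return Math.ceil((itemCount * wcuPerItem) / provisionedWcu);
}

// Hypothetical example: 1,000,000 items of about 2.5 KB on a table with 100 WCU.
console.log(estimateDeleteSeconds(1000000, 2560, 100)); // 30000 (seconds, roughly 8.3 hours)
```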
If you really want to, you can delete each item individually. You're on the right track: you just need to specify both the hash and range keys in your scan projection and in the delete command.
aws dynamodb scan \
  --attributes-to-get $HASH_KEY $RANGE_KEY \
  --table-name $TABLE_NAME --query "Items[*]" |
# use jq to get each item on its own line
jq --compact-output '.[]' |
# replace newlines with null terminators so
# we can tell xargs to ignore special characters
tr '\n' '\0' |
# use the whole item as the key to delete (dynamo keys *are* dynamo items)
xargs -0 -t -I keyItem \
  aws dynamodb delete-item --table-name $TABLE_NAME --key=keyItem
If you want to get super fancy, you can use the describe-table call to fetch the hash and range keys and populate $HASH_KEY and $RANGE_KEY, but I'll leave that as an exercise for you.
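One possible take on that exercise, sketched in Node.js rather than bash: given the parsed JSON printed by describe-table, read the key attribute names out of Table.KeySchema. The function name and the sample table are hypothetical. The only assumption about the data is that KeySchema entries carry AttributeName and a KeyType of HASH or RANGE.

```javascript
// `description` is assumed to be the parsed JSON printed by
// `aws dynamodb describe-table`, i.e. an object with a Table.KeySchema array.
function keyNamesFromDescription(description) {
  const names = { hashKey: undefined, rangeKey: undefined };
  for (const { AttributeName, KeyType } of description.Table.KeySchema) {
    if (KeyType === 'HASH') names.hashKey = AttributeName;
    if (KeyType === 'RANGE') names.rangeKey = AttributeName;
  }
  return names;
}

// Hypothetical table with a composite key.
const sampleDescription = {
  Table: {
    KeySchema: [
      { AttributeName: 'customerId', KeyType: 'HASH' },
      { AttributeName: 'orderDate', KeyType: 'RANGE' },
    ],
  },
};
console.log(keyNamesFromDescription(sampleDescription));
```

For a table with only a partition key, rangeKey stays undefined, so the caller can decide whether to project one attribute or two.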
Upvotes: 39
Reputation: 51
If you'd rather not write a script and only need a one-time, semi-manual cleanup (one click starts deletion with automatic pagination), and you have permission to manage the table from the AWS console,
you can use this Chrome extension: https://chromewebstore.google.com/detail/dynamodb-auto-delete/oiicbadbamlhijkjildbhamionnbdaia?authuser=0&hl=en&pli=1
Upvotes: 0
Reputation: 21
I used some of the examples here to write a script that takes the table name as a parameter, then deletes and re-creates the table. It works fine:
TABLE_NAME='<your_table_name>' ;\
aws dynamodb describe-table --table-name $TABLE_NAME \
|jq '.Table + .Table.BillingModeSummary
|del(.TableId, .TableArn, .ItemCount, .TableSizeBytes, .CreationDateTime,
.TableStatus, .ProvisionedThroughput, .BillingModeSummary,
.LastUpdateToPayPerRequestDateTime, .GlobalSecondaryIndexes[].IndexStatus,
.GlobalSecondaryIndexes[].IndexSizeBytes,
.GlobalSecondaryIndexes[].ItemCount, .GlobalSecondaryIndexes[].IndexArn,
.GlobalSecondaryIndexes[].ProvisionedThroughput)' > tb_schema.json
aws dynamodb delete-table --table-name $TABLE_NAME
aws dynamodb create-table --cli-input-json file://tb_schema.json
Upvotes: 0
Reputation: 47
I've created a node module to do this:
https://www.npmjs.com/package/dynamodb-empty
yarn global add dynamodb-empty
dynamodb-empty --table tableName
Upvotes: 3
Reputation: 3913
From the @Adel and @codeperson answers here I made a function using Amplify CLI (With Hello World template), where the table name has to be passed using the event object:
/* Amplify Params - DO NOT EDIT
API_DEALSPOON_GRAPHQLAPIENDPOINTOUTPUT
API_DEALSPOON_GRAPHQLAPIIDOUTPUT
Amplify Params - DO NOT EDIT */
const AWS = require('aws-sdk')
const environment = process.env.ENV
const region = process.env.REGION
const apiDealspoonGraphQLAPIIdOutput = process.env.API_DEALSPOON_GRAPHQLAPIIDOUTPUT
exports.handler = async (event) => {
const DynamoDb = new AWS.DynamoDB.DocumentClient({region});
// const tableName = "dev-invite";
// const hashKey = "InviteToken";
let {tableName, hashKey} = event
tableName = `${tableName}-${apiDealspoonGraphQLAPIIdOutput}-${environment}`
// Customization 4: add logic to determine which items to delete (return true if you want to delete the respective item)
// If you don't want to filter anything out, then just return true in this function (or remove the filter step below, where this filter is used)
const shouldDeleteItem = (item) => {
return item.Type === "SECURE_MESSAGE" || item.Type === "PATIENT";
};
const getAllItemsFromTable = async (lastEvaluatedKey) => {
const res = await DynamoDb.scan({
TableName: tableName,
ExclusiveStartKey: lastEvaluatedKey
}).promise();
return {items: res.Items, lastEvaluatedKey: res.LastEvaluatedKey};
};
const deleteAllItemsFromTable = async (items) => {
let numItemsDeleted = 0;
// Split items into patches of 25
// 25 items is max for batchWrite
await asyncForEach(split(items, 25), async (patch, i) => {
const requestItems = {
[tableName]: patch.filter(shouldDeleteItem).map(item => {
numItemsDeleted++;
return {
DeleteRequest: {
Key: {
[hashKey]: item[hashKey]
}
}
};
})
};
if (requestItems[tableName].length > 0) {
await DynamoDb.batchWrite({RequestItems: requestItems}).promise();
console.log(`finished deleting ${numItemsDeleted} items this batch`);
}
});
return {numItemsDeleted};
};
function split(arr, n) {
const res = [];
while (arr.length) {
res.push(arr.splice(0, n));
}
return res;
}
async function asyncForEach(array, callback) {
for (let index = 0; index < array.length; index++) {
await callback(array[index], index, array);
}
}
let lastEvaluatedKey;
let totalItemsFetched = 0;
let totalItemsDeleted = 0;
console.log(`------ Deleting from table ${tableName}`);
do {
const {items, lastEvaluatedKey: lek} = await getAllItemsFromTable(lastEvaluatedKey);
totalItemsFetched += items.length;
console.log(`--- a group of ${items.length} was fetched`);
const {numItemsDeleted} = await deleteAllItemsFromTable(items);
totalItemsDeleted += numItemsDeleted;
console.log(`--- ${numItemsDeleted} items deleted`);
lastEvaluatedKey = lek;
} while (!!lastEvaluatedKey);
console.log("Done!");
console.log(`${totalItemsFetched} items total fetched`);
console.log(`${totalItemsDeleted} items total deleted`);
};
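The handler above builds each Key from the hash key alone. For a table with both partition and sort keys, every Key in a batch delete has to carry both attributes. A minimal sketch of that step follows. The helper names (toDeleteRequest, chunk, buildBatches) and the keyNames parameter are hypothetical and not part of the answer above.

```javascript
// keyNames: the attribute names that make up the primary key,
// e.g. ['pk'] for a simple key or ['pk', 'sk'] for a composite key.
function toDeleteRequest(item, keyNames) {
  const Key = {};
  for (const name of keyNames) {
    if (!(name in item)) throw new Error(`item is missing key attribute "${name}"`);
    Key[name] = item[name];
  }
  return { DeleteRequest: { Key } };
}

// Non-mutating chunking: unlike a splice-based split, the input array is left intact.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

// Builds one batchWrite parameter object per group of at most 25 items.
function buildBatches(tableName, items, keyNames) {
  return chunk(items, 25).map(group => ({
    RequestItems: { [tableName]: group.map(item => toDeleteRequest(item, keyNames)) },
  }));
}
```

Each element returned by buildBatches could then be passed to DynamoDb.batchWrite(...).promise() in place of the requestItems object built above.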
Upvotes: 0
Reputation: 191
If you are interested in doing it with Node.js, have a look at this example (I'm using TypeScript here). Further related information can be found in the AWS docs.
import AWS from 'aws-sdk';
const DynamoDb = new AWS.DynamoDB.DocumentClient({
region: 'eu-west-1'
});
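// note: a single scan call returns at most 1 MB of data; larger tables need pagination via LastEvaluatedKey
export const getAllItemsFromTable = async (TableName: string) => {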
const Res = await DynamoDb.scan({ TableName }).promise();
return Res.Items;
};
export const deleteAllItemsFromTable = async (TableName: string, items: { id: string }[]) => {
let counter = 0;
// split items into batches of 25
// 25 items is the max for batchWrite
await asyncForEach(split(items, 25), async (patch, i) => {
const RequestItems = {
// computed key: the table name itself must be the property name
[TableName]: patch.map(item => {
return {
DeleteRequest: {
Key: {
id: item.id
}
}
};
})
};
await DynamoDb.batchWrite({ RequestItems }).promise();
counter += patch.length;
console.log('counter : ', counter);
});
};
function split(arr, n) {
var res = [];
while (arr.length) {
res.push(arr.splice(0, n));
}
return res;
}
async function asyncForEach(array, callback) {
for (let index = 0; index < array.length; index++) {
await callback(array[index], index, array);
}
}
const tableName = "table";
// assuming the table's hash key is named "id"
getAllItemsFromTable(tableName)
.then(items => deleteAllItemsFromTable(tableName, (items || []) as { id: string }[]))
.catch(console.error);
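A batchWrite call can return successfully while leaving some requests unprocessed, for example under throttling. The leftovers come back in UnprocessedItems. Below is a minimal sketch of retrying them, written in plain JavaScript. The function name batchWriteAll, the injected batchWrite parameter, and the backoff values are assumptions for illustration, not part of the example above.

```javascript
// `batchWrite` is any function that takes { RequestItems } and resolves to
// { UnprocessedItems }, e.g. params => DynamoDb.batchWrite(params).promise().
// Resolves to the number of attempts used; throws if leftovers remain.
async function batchWriteAll(batchWrite, requestItems, maxAttempts = 5, baseDelayMs = 50) {
  let pending = requestItems;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await batchWrite({ RequestItems: pending });
    const unprocessed = res.UnprocessedItems || {};
    if (Object.keys(unprocessed).length === 0) return attempt;
    pending = unprocessed;
    if (attempt < maxAttempts) {
      // simple exponential backoff before retrying the leftovers
      await new Promise(resolve => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
  throw new Error(`unprocessed items remain after ${maxAttempts} attempts`);
}
```

Injecting the write function keeps the retry logic independent of the SDK version and makes it easy to exercise with a stub.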
Upvotes: 7
Reputation: 9
We had some tables with indexes, so a few more fields had to be deleted, including ".ProvisionedThroughput.LastDecreaseDateTime". It took a little fiddling since I'm totally new to jq ;-) but this is what worked for us:
aws dynamodb describe-table --table-name $table_name | jq '.Table | del(.TableId, .TableArn, .ItemCount, .TableSizeBytes, .CreationDateTime, .TableStatus, .LatestStreamArn, .LatestStreamLabel, .ProvisionedThroughput.NumberOfDecreasesToday, .ProvisionedThroughput.LastIncreaseDateTime, .ProvisionedThroughput.LastDecreaseDateTime, .GlobalSecondaryIndexes[].IndexSizeBytes, .GlobalSecondaryIndexes[].ProvisionedThroughput.NumberOfDecreasesToday, .GlobalSecondaryIndexes[].IndexStatus, .GlobalSecondaryIndexes[].IndexArn, .GlobalSecondaryIndexes[].ItemCount)' > schema.json
Upvotes: 0
Reputation: 415
To correct what @Cheruvian posted: the following commands work. There are a few more fields that need to be excluded when creating schema.json.
aws dynamodb describe-table --table-name $table_name | jq '.Table | del(.TableId, .TableArn, .ItemCount, .TableSizeBytes, .CreationDateTime, .TableStatus, .LatestStreamArn, .LatestStreamLabel, .ProvisionedThroughput.NumberOfDecreasesToday, .ProvisionedThroughput.LastIncreaseDateTime)' > schema.json
aws dynamodb delete-table --table-name $table_name
aws dynamodb create-table --cli-input-json file://schema.json
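Since the list of fields to delete keeps growing from answer to answer, an alternative is to copy only the fields that create-table accepts. The following is a sketch of that idea in Node.js, not a drop-in replacement for the jq filter. The function names are hypothetical, and it deliberately covers only the key schema, billing mode, throughput, and global secondary indexes. Local secondary indexes, streams, encryption settings, and tags are ignored here and would need to be added if the table uses them.

```javascript
// Copies only the two capacity values, dropping read-only counters and timestamps.
function pickThroughput(pt) {
  return { ReadCapacityUnits: pt.ReadCapacityUnits, WriteCapacityUnits: pt.WriteCapacityUnits };
}

// `table` is assumed to be the `Table` object from describe-table output.
function toCreateTableInput(table) {
  const summary = table.BillingModeSummary || {};
  const onDemand = summary.BillingMode === 'PAY_PER_REQUEST';
  const input = {
    TableName: table.TableName,
    AttributeDefinitions: table.AttributeDefinitions,
    KeySchema: table.KeySchema,
    BillingMode: onDemand ? 'PAY_PER_REQUEST' : 'PROVISIONED',
  };
  if (!onDemand) input.ProvisionedThroughput = pickThroughput(table.ProvisionedThroughput);
  if (table.GlobalSecondaryIndexes) {
    input.GlobalSecondaryIndexes = table.GlobalSecondaryIndexes.map(gsi => {
      const out = { IndexName: gsi.IndexName, KeySchema: gsi.KeySchema, Projection: gsi.Projection };
      if (!onDemand) out.ProvisionedThroughput = pickThroughput(gsi.ProvisionedThroughput);
      return out;
    });
  }
  return input;
}
```

The result of toCreateTableInput(description.Table), serialized with JSON.stringify, could be written to schema.json and passed to create-table via --cli-input-json as in the commands above.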
Upvotes: 14