Reputation: 315
I've read through AWS's docs on pagination, which specify:
In a response, DynamoDB returns all the matching results within the scope of the Limit value. For example, if you issue a Query or a Scan request with a Limit value of 6 and without a filter expression, DynamoDB returns the first six items in the table that match the specified key conditions in the request (or just the first six items in the case of a Scan with no filter)
Which means that given I have a table called Questions with an attribute called difficulty (that can take any numeric value ranging from 0 to 2) I might end up with the following conundrum:
A client requests GET /questions?difficulty=0&limit=3.
I pass the limit of 3 to the DynamoDB query, which might return 0 items, as the first 3 in the collection might not be of difficulty == 0.
I then have to query again to fetch more questions that match that criteria, without knowing whether I might return duplicates.
How can I then paginate based on a query correctly? Something where I'll get as many results as I asked for whilst having the correct offset.
Upvotes: 28
Views: 55270
Reputation: 851
Here is a complete example of how to use the paginateQuery utility from @aws-sdk/client-dynamodb (@aws-sdk/lib-dynamodb exports a paginateQuery for the document client as well).
import {
DynamoDBClient,
QueryCommandInput,
QueryCommandOutput,
paginateQuery,
} from "@aws-sdk/client-dynamodb";
const dbClient = new DynamoDBClient({ region: "us-east-1" });
const items: QueryCommandOutput[] = [];
const query: QueryCommandInput = {
TableName: VIDEOS_TABLE_NAME,
IndexName: "GSI1",
KeyConditionExpression: "GSI1PK = :videoId",
ExpressionAttributeValues: {
":videoId": { S: "VIDEO#123" },
},
};
const pager = paginateQuery({ client: dbClient }, query);
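// Each iteration yields one page (a QueryCommandOutput), not a single record;
// the matching records of that page are in item.Items.
for await (const item of pager) {
items.push(item);
}
// every page of output from the query
console.log(items);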
Upvotes: 1
Reputation: 1
An example in TypeScript using recursion:
// QueryCommandInput is exported from the package root; no need to reach into dist-types.
import { DynamoDBClient, QueryCommand, QueryCommandInput } from "@aws-sdk/client-dynamodb";
export const getAllDataDynamodb = async (params: QueryCommandInput) => {
const dynamodbClient = new DynamoDBClient({ region: process.env.AWS_REGION });
const command = new QueryCommand(params);
let data: any[] = [];
const results = await dynamodbClient.send(command);
// Items is optional in the SDK v3 response type, so guard against undefined.
if (results.Items?.length) {
data = results.Items;
}
if (results.LastEvaluatedKey) {
console.log("Scanning for more...", results);
params.ExclusiveStartKey = results.LastEvaluatedKey;
data = [...data, ...(await getAllDataDynamodb(params))];
}
return data;
}
Upvotes: 0
Reputation: 76
Using DynamoDB pagination with async generators:
let items = []
let params = {
TableName: 'mytable',
Limit: 1000,
KeyConditionExpression: 'mykey = :key',
ExpressionAttributeValues: {
':key': { S: 'myvalue' },
},
}
// Takes the params object directly, matching the fetchData(params) call below.
async function* fetchData(params) {
let data
do {
data = await dynamodb.query(params).promise()
yield data.Items
params.ExclusiveStartKey = data.LastEvaluatedKey
} while (typeof data.LastEvaluatedKey != 'undefined')
}
for await (const data of fetchData(params)) {
items = [...items, ...data]
}
Upvotes: 0
Reputation: 970
I hope you have figured it out by now, but in case others find it useful: AWS has QueryPaginator/ScanPaginator, which are as simple as below:
const paginator = new QueryPaginator(dynamoDb, queryInput);
for await (const page of paginator) {
// do something with the first page of results
break
}
See more details at https://github.com/awslabs/dynamodb-data-mapper-js/tree/master/packages/dynamodb-query-iterator
2022-05-19:
For AWS SDK v3 see how to use paginateXXXX at this blog post: https://aws.amazon.com/blogs/developer/pagination-using-async-iterators-in-modular-aws-sdk-for-javascript/
Upvotes: 11
Reputation: 1417
You can also achieve this using recursion instead of a global variable, like:
const getAllData = async (params, allData = []) => {
let data = await db.scan(params).promise();
return (data.LastEvaluatedKey) ?
getAllData({...params, ExclusiveStartKey: data.LastEvaluatedKey}, [...allData, ...data['Items']]) :
[...allData, ...data['Items']];
};
Then you can simply call it like:
let test = await getAllData({ "TableName": "test-table"}); // feel free to add try/catch
Upvotes: 0
Reputation: 107
Using async/await and returning the accumulated data. An elaboration on @Roshan Khandelwal's answer:
const getAllData = async (params, allData = []) => {
const data = await dynamodbDocClient.scan(params).promise()
if (data['Items'].length > 0) {
allData = [...allData, ...data['Items']]
}
if (data.LastEvaluatedKey) {
params.ExclusiveStartKey = data.LastEvaluatedKey
return await getAllData(params, allData)
} else {
return allData
}
}
Call inside a try/catch:
try {
const data = await getAllData(params);
console.log("my data: ", data);
} catch(error) {
console.log(error);
}
Upvotes: 3
Reputation: 1624
Avoid using recursion to prevent call stack overflow. An iterative solution extending @Roshan Khandelwal's approach:
const getAllData = async (params) => {
const _getAllData = async (params, startKey) => {
if (startKey) {
params.ExclusiveStartKey = startKey
}
return this.documentClient.query(params).promise()
}
let lastEvaluatedKey = null
let rows = []
do {
const result = await _getAllData(params, lastEvaluatedKey)
rows = rows.concat(result.Items)
lastEvaluatedKey = result.LastEvaluatedKey
} while (lastEvaluatedKey)
return rows
}
Upvotes: 17
Reputation: 356
To paginate a DynamoDB scan, pass parameters like:
var params = {
"TableName" : "abcd",
"FilterExpression" : "#someexperssion=:someexperssion",
"ExpressionAttributeNames" : {"#someexperssion":"someexperssion"},
"ExpressionAttributeValues" : {":someexperssion" : "value"},
"Limit" : 20,
"ExclusiveStartKey" : {"id": "9ee10f6e-ce6d-4820-9fcd-cabb0d93e8da"}
};
DB.scan(params).promise();
where ExclusiveStartKey is the LastEvaluatedKey returned by the previous execution of this scan.
Upvotes: 3
Reputation: 973
Using async/await.
const getAllData = async (params) => {
console.log("Querying Table");
let data = await docClient.query(params).promise();
if(data['Items'].length > 0) {
allData = [...allData, ...data['Items']];
}
if (data.LastEvaluatedKey) {
params.ExclusiveStartKey = data.LastEvaluatedKey;
return await getAllData(params);
} else {
return data;
}
}
I am using a global variable allData (declared outside the function, e.g. let allData = [];) to collect all the data.
The call to this function is wrapped in a try/catch:
try {
await getAllData(params);
console.log("Processing Completed");
// console.log(allData);
} catch(error) {
console.log(error);
}
I am using this from within a Lambda and it works fine.
The article here really helped and guided me. Thanks.
Upvotes: 32
Reputation: 31
You can create a secondary index on difficulty and, when querying, set KeyConditionExpression to difficulty = 0. Like this:
var params = {
TableName: 'Questions',
IndexName: 'difficulty-index',
KeyConditionExpression: 'difficulty = :difficulty ',
ExpressionAttributeValues: {':difficulty':0}
}
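With difficulty as a key condition rather than a filter, Limit counts only matching items, which is the behaviour the docs quoted in the question describe. A minimal sketch of building the params for each page, assuming DocumentClient-style values; the helper name buildQuestionsQuery and the example start key are hypothetical:

```javascript
// Hypothetical helper: builds Query params for one page of questions.
// Assumes a GSI named 'difficulty-index' whose partition key is `difficulty`.
function buildQuestionsQuery(difficulty, limit, exclusiveStartKey) {
  const params = {
    TableName: "Questions",
    IndexName: "difficulty-index",
    KeyConditionExpression: "difficulty = :difficulty",
    ExpressionAttributeValues: { ":difficulty": difficulty },
    Limit: limit,
  };
  // Only set ExclusiveStartKey when resuming from a previous page.
  if (exclusiveStartKey) {
    params.ExclusiveStartKey = exclusiveStartKey;
  }
  return params;
}

// First page: no start key.
const firstPage = buildQuestionsQuery(0, 3);
// Next page: pass the LastEvaluatedKey of the previous response unchanged.
// (The key shape here assumes the table's primary key is `id`.)
const nextPage = buildQuestionsQuery(0, 3, { id: "q-17", difficulty: 0 });
```

For a query on an index, the LastEvaluatedKey contains both the index key and the table's primary key, so it is safest to hand it back exactly as received.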
Upvotes: 0
Reputation: 4411
Here is an example of how to iterate over a paginated result set from a DynamoDB scan (can be easily adapted for query as well) in Node.js.
You could save the LastEvaluatedKey state server-side and pass an identifier back to your client, which it would send with its next request, and your server would pass that value as ExclusiveStartKey in the next request to DynamoDB.
const AWS = require('aws-sdk');
AWS.config.logger = console;
const dynamodb = new AWS.DynamoDB({ apiVersion: '2012-08-10' });
let val = 'some value';
let params = {
TableName: "MyTable",
ExpressionAttributeValues: {
':val': {
S: val,
},
},
Limit: 1000,
FilterExpression: 'MyAttribute = :val',
// ExclusiveStartKey: thisUsersScans[someRequestParamScanID]
};
dynamodb.scan(params, function scanUntilDone(err, data) {
if (err) {
console.log(err, err.stack);
} else {
// do something with data
if (data.LastEvaluatedKey) {
params.ExclusiveStartKey = data.LastEvaluatedKey;
dynamodb.scan(params, scanUntilDone);
} else {
// all results scanned. done!
someCallback();
}
}
});
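A stateless variation on the "pass an identifier back to your client" idea is to encode the LastEvaluatedKey itself as an opaque cursor string, instead of storing it server-side. This is only a sketch under that assumption; encodeCursor and decodeCursor are hypothetical helper names, not part of the AWS SDK:

```javascript
// Encode a LastEvaluatedKey object as an opaque, URL-safe cursor string.
function encodeCursor(lastEvaluatedKey) {
  if (!lastEvaluatedKey) return null; // no further pages
  const json = JSON.stringify(lastEvaluatedKey);
  return encodeURIComponent(Buffer.from(json, "utf8").toString("base64"));
}

// Decode a cursor sent by the client back into an ExclusiveStartKey.
// Throws if the cursor is not valid JSON after decoding.
function decodeCursor(cursor) {
  if (!cursor) return undefined; // first page: no ExclusiveStartKey
  const json = Buffer.from(decodeURIComponent(cursor), "base64").toString("utf8");
  return JSON.parse(json);
}
```

The server would return encodeCursor(data.LastEvaluatedKey) alongside the items and set params.ExclusiveStartKey = decodeCursor(req.query.cursor) on the next request. Since the client can tamper with such a token, you would probably want to validate or sign it before trusting it.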
Upvotes: 20
Reputation: 5205
Query and Scan operations return LastEvaluatedKey in their responses. Absent concurrent insertions, you will not miss items nor will you encounter items multiple times, as long as you iterate calls to Query/Scan and set ExclusiveStartKey to the LastEvaluatedKey of the previous call.
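To get exactly as many filtered results as the client asked for, one option is to keep calling with ExclusiveStartKey and shrink Limit to the number of items still missing, so the returned LastEvaluatedKey never points past an item you have not handed out. A rough sketch of that loop follows; makeFakeTable is a hypothetical in-memory stand-in for DynamoDB (its `filter` function plays the role of a FilterExpression), used only so the example runs without AWS:

```javascript
// Hypothetical in-memory stand-in for a filtered Query/Scan.
// Like DynamoDB, it applies Limit to the items it evaluates, BEFORE filtering.
function makeFakeTable(rows) {
  return {
    async query({ Limit, ExclusiveStartKey, filter }) {
      const start = ExclusiveStartKey
        ? rows.findIndex((r) => r.id === ExclusiveStartKey.id) + 1
        : 0;
      const evaluated = rows.slice(start, start + Limit);
      const end = start + evaluated.length;
      return {
        Items: evaluated.filter(filter),
        // Present only while there is more data left to evaluate.
        LastEvaluatedKey: end < rows.length ? { id: rows[end - 1].id } : undefined,
      };
    },
  };
}

// Collect up to `pageSize` matching items, starting after `cursor` (if any).
async function fetchPage(table, filter, pageSize, cursor) {
  const items = [];
  let startKey = cursor;
  do {
    const res = await table.query({
      Limit: pageSize - items.length, // only ask for what is still missing
      ExclusiveStartKey: startKey,
      filter,
    });
    items.push(...res.Items);
    startKey = res.LastEvaluatedKey;
  } while (items.length < pageSize && startKey);
  // nextCursor is undefined once the data has been exhausted.
  return { items, nextCursor: startKey };
}
```

The trade-off is more round trips when matches are sparse; a secondary index on the filtered attribute avoids that, since Limit then applies to matching items only.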
Upvotes: 5