user2061811

Reputation: 315

Pagination in DynamoDB using Node.js?

I've read through AWS's docs on pagination. As the docs specify:

In a response, DynamoDB returns all the matching results within the scope of the Limit value. For example, if you issue a Query or a Scan request with a Limit value of 6 and without a filter expression, DynamoDB returns the first six items in the table that match the specified key conditions in the request (or just the first six items in the case of a Scan with no filter)

Which means that, given a table called Questions with an attribute called difficulty (which can take any numeric value from 0 to 2), I might end up with the following conundrum:

How can I then paginate correctly based on a query? I want to get as many results as I asked for, with the correct offset.

Upvotes: 28

Views: 55270

Answers (12)

Shivam

Reputation: 851

Here is a complete example of how to use the paginateQuery utility from @aws-sdk/client-dynamodb.

import {
  DynamoDBClient,
  QueryCommandInput,
  QueryCommandOutput,
  paginateQuery,
} from "@aws-sdk/client-dynamodb";

const dbClient = new DynamoDBClient({ region: "us-east-1" });

// Each iteration yields one page (a full QueryCommandOutput), not one item
const pages: QueryCommandOutput[] = [];

const query: QueryCommandInput = {
  TableName: VIDEOS_TABLE_NAME, // your table name, defined elsewhere
  IndexName: "GSI1",
  KeyConditionExpression: "GSI1PK = :videoId",
  ExpressionAttributeValues: {
    ":videoId": { S: "VIDEO#123" },
  },
};

const pager = paginateQuery({ client: dbClient }, query);

for await (const page of pager) {
  pages.push(page);
}

// Flatten the pages to get every item returned by the query
const items = pages.flatMap((page) => page.Items ?? []);
console.log(items);

Upvotes: 1

Jose Parra

Reputation: 1

An example in TypeScript using recursion:

import { DynamoDBClient, QueryCommand, QueryCommandInput } from "@aws-sdk/client-dynamodb";

// Create the client once rather than on every recursive call
const dynamodbClient = new DynamoDBClient({ region: process.env.AWS_REGION });

export const getAllDataDynamodb = async (params: QueryCommandInput): Promise<any[]> => {
    const results = await dynamodbClient.send(new QueryCommand(params));
    // Items can be undefined, so fall back to an empty array
    let data: any[] = results.Items ?? [];
    if (results.LastEvaluatedKey) {
        console.log("Querying for more...");
        // Copy params instead of mutating the caller's object
        const nextParams = { ...params, ExclusiveStartKey: results.LastEvaluatedKey };
        data = [...data, ...(await getAllDataDynamodb(nextParams))];
    }
    return data;
};

Upvotes: 0

Socratic Programmer

Reputation: 76

Using DynamoDB pagination with async generators:

let items = []
let params = {
    TableName: 'mytable',
    Limit: 1000,
    KeyConditionExpression: 'mykey = :key',
    ExpressionAttributeValues: {
      ':key': { S: 'myvalue' },
    },
}

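// `dynamodb` is assumed to be an AWS SDK v2 client, e.g. new AWS.DynamoDB()
// Takes params directly, matching the fetchData(params) call below
async function* fetchData(params) {
    let data
    do {
      data = await dynamodb.query(params).promise()
      yield data.Items
      // Continue from where the previous page stopped
      params.ExclusiveStartKey = data.LastEvaluatedKey
    } while (typeof data.LastEvaluatedKey != 'undefined')
}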

for await (const data of fetchData(params)) {
    items = [...items, ...data]
}

Upvotes: 0

Anton

Reputation: 970

I hope you figured it out, but in case others find it useful: AWS provides QueryPaginator/ScanPaginator, which are as simple to use as the example below:

const paginator = new QueryPaginator(dynamoDb, queryInput);

for await (const page of paginator) {
    // do something with the first page of results
    break
}

See more details at https://github.com/awslabs/dynamodb-data-mapper-js/tree/master/packages/dynamodb-query-iterator

2022-05-19: For AWS SDK v3 see how to use paginateXXXX at this blog post https://aws.amazon.com/blogs/developer/pagination-using-async-iterators-in-modular-aws-sdk-for-javascript/

Upvotes: 11

mim

Reputation: 1417

You can also achieve this using recursion instead of a global variable, like:

const getAllData = async (params, allData = []) => {
    let data = await db.scan(params).promise();
    return (data.LastEvaluatedKey) ?
      getAllData({...params, ExclusiveStartKey: data.LastEvaluatedKey}, [...allData, ...data['Items']]) :
      [...allData, ...data['Items']];
};

Then you can simply call it like:

 let test = await getAllData({ "TableName": "test-table"}); // feel free to add try/catch

Upvotes: 0

ben_lize

Reputation: 107

Using async/await and returning the collected data rather than relying on a global variable. This elaborates on @Roshan Khandelwal's answer.

const getAllData = async (params, allData = []) => {
  const data = await dynamodbDocClient.scan(params).promise()

  if (data['Items'].length > 0) {
    allData = [...allData, ...data['Items']]
  }

  if (data.LastEvaluatedKey) {
    params.ExclusiveStartKey = data.LastEvaluatedKey
    return await getAllData(params, allData)
  } else {
    return allData
  }
}

Call inside a try/catch:

try {
  const data = await getAllData(params);
  console.log("my data: ", data);
} catch (error) {
  console.log(error);
}

Upvotes: 3

siutsin

Reputation: 1624

Avoid using recursion to prevent call stack overflow. An iterative solution extending @Roshan Khandelwal's approach:

const getAllData = async (params) => {
  const _getAllData = async (params, startKey) => {
    if (startKey) {
      params.ExclusiveStartKey = startKey
    }
    return this.documentClient.query(params).promise()
  }
  let lastEvaluatedKey = null
  let rows = []
  do {
    const result = await _getAllData(params, lastEvaluatedKey)
    rows = rows.concat(result.Items)
    lastEvaluatedKey = result.LastEvaluatedKey
  } while (lastEvaluatedKey)
  return rows
}

Upvotes: 17

Sukalyan Debsingha

Reputation: 356

To paginate a DynamoDB scan, pass parameters like these:

var params = {
    "TableName"                 : "abcd",
    "FilterExpression"          : "#someexpression = :someexpression",
    "ExpressionAttributeNames"  : {"#someexpression": "someexpression"},
    "ExpressionAttributeValues" : {":someexpression": "value"},
    "Limit"                     : 20,
    "ExclusiveStartKey"         : {"id": "9ee10f6e-ce6d-4820-9fcd-cabb0d93e8da"}
};
DB.scan(params).promise();

where ExclusiveStartKey is the LastEvaluatedKey returned by the previous execution of this scan.

Upvotes: 3

Roshan Khandelwal

Reputation: 973

Using async/await.

// Global accumulator for the items from every page
let allData = [];

const getAllData = async (params) => {
    console.log("Querying Table");
    let data = await docClient.query(params).promise();

    if (data['Items'].length > 0) {
        allData = [...allData, ...data['Items']];
    }

    if (data.LastEvaluatedKey) {
        // More pages remain: continue after the last evaluated key
        params.ExclusiveStartKey = data.LastEvaluatedKey;
        return await getAllData(params);
    } else {
        return data;
    }
}

I am using a global variable allData to collect all the data.

Calling this function is enclosed within a try-catch

try {
    await getAllData(params);
    console.log("Processing Completed");

    // console.log(allData);
} catch (error) {
    console.log(error);
}

I am using this from within a Lambda and it works fine.

The article here really helped and guided me. Thanks.

Upvotes: 32

DonMedardo

Reputation: 31

You can create a secondary index on difficulty and, when querying, set KeyConditionExpression to match difficulty = 0, like this:

var params = {
    TableName: 'Questions',
    IndexName: 'difficulty-index',
    KeyConditionExpression: 'difficulty = :difficulty',
    ExpressionAttributeValues: {':difficulty': 0}
}

Upvotes: 0

andrhamm

Reputation: 4411

Here is an example of how to iterate over a paginated result set from a DynamoDB scan (can be easily adapted for query as well) in Node.js.

You could save the LastEvaluatedKey state serverside and pass an identifier back to your client, which it would send with its next request and your server would pass that value as ExclusiveStartKey in the next request to DynamoDB.

const AWS = require('aws-sdk');
AWS.config.logger = console;

const dynamodb = new AWS.DynamoDB({ apiVersion: '2012-08-10' });

let val = 'some value';

let params = {
  TableName: "MyTable",
  ExpressionAttributeValues: {
    ':val': {
      S: val,
    },
  },
  Limit: 1000,
  FilterExpression: 'MyAttribute = :val',
  // ExclusiveStartKey: thisUsersScans[someRequestParamScanID]
};

dynamodb.scan(params, function scanUntilDone(err, data) {
  if (err) {
    console.log(err, err.stack);
  } else {
    // do something with data

    if (data.LastEvaluatedKey) {
      params.ExclusiveStartKey = data.LastEvaluatedKey;

      dynamodb.scan(params, scanUntilDone);
    } else {
      // all results scanned. done!
      someCallback();
    }
  }
});
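
For the handoff to the client described above, one option is to give the client the LastEvaluatedKey itself as an opaque token instead of storing it server-side. The following is only a minimal sketch and not part of the original example: encodeCursor and decodeCursor are hypothetical helper names, and it assumes the key attributes are plain strings or numbers (binary key values would not survive the JSON round trip).

```javascript
// Hypothetical helpers: turn a LastEvaluatedKey into a URL-safe token and back.

// Returns null when there is no key, i.e. there are no more pages.
function encodeCursor(lastEvaluatedKey) {
  if (!lastEvaluatedKey) return null;
  return Buffer.from(JSON.stringify(lastEvaluatedKey)).toString("base64url");
}

// Returns undefined when no token was sent, so the scan starts from the beginning.
function decodeCursor(token) {
  if (!token) return undefined;
  return JSON.parse(Buffer.from(token, "base64url").toString("utf8"));
}
```

The decoded value would be passed as ExclusiveStartKey on the next scan or query. Since the client can alter the token, validate or sign it if that matters for your application.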

Upvotes: 20

Alexander Patrikalakis

Reputation: 5205

Query and Scan operations return LastEvaluatedKey in their responses. Absent concurrent insertions, you will not miss items nor will you encounter items multiple times, as long as you iterate calls to Query/Scan and set ExclusiveStartKey to the LastEvaluatedKey of the previous call.
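
Applied to the original question, that rule can be used to fill a page of a fixed size even when a filter discards some items. The sketch below is an illustration under assumptions, not an official API: fetchPage and queryPage are hypothetical names, and queryPage stands for any function that sends a single Query or Scan request, for example (params) => docClient.query(params).promise() with AWS SDK v2. It sets Limit to the number of items still missing, so a call can never return more items than needed and the final LastEvaluatedKey is always a valid start key for the next page.

```javascript
// Collect up to `pageSize` matching items, following LastEvaluatedKey as needed.
// `queryPage` is a hypothetical function that performs one Query/Scan request.
async function fetchPage(queryPage, baseParams, pageSize, startKey) {
  const items = [];
  let cursor = startKey;
  do {
    // Ask only for what is still missing, so the page can never overshoot.
    const params = { ...baseParams, Limit: pageSize - items.length };
    if (cursor) params.ExclusiveStartKey = cursor;
    const result = await queryPage(params);
    items.push(...(result.Items || []));
    cursor = result.LastEvaluatedKey;
  } while (cursor && items.length < pageSize);
  // nextKey is undefined once the data is exhausted.
  return { items, nextKey: cursor };
}
```

Because Limit counts evaluated items before the filter is applied, a very selective filter can turn this into many small requests; in that case a key or index designed for the query is likely the better fix.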

Upvotes: 5
