ZZzzZZzz

Reputation: 1844

Querying DynamoDB on an index that has 1 million records

Is there a way to query a global secondary index of a DynamoDB table and fetch, say, 1000 records at a time, with each subsequent query returning the next set? I have a Java application that queries the table and fetches all the records associated with it, which causes a heap space error on my EC2 instance. Is there also a way to parallelize a DynamoDB query operation, like the Kinesis checkpointer? Below is how I am querying.

    // `test` is the RecordsTest key object holding the hash key value to match
    DynamoDBQueryExpression<RecordsTest> queryExpression = new DynamoDBQueryExpression<RecordsTest>()
            .withHashKeyValues(test)
            .withConsistentRead(false);
    List<RecordsTest> records = mapper.query(RecordsTest.class, queryExpression);
    for (RecordsTest record : records) {
        System.out.println(record);
    }

I have also tried the QuerySpec option, but it returns the same set of items from the table when I specify a limit on the number of items to return. I want each call to return only the items that were not returned earlier.

Upvotes: 0

Views: 2872

Answers (1)

Alexander Patrikalakis

Reputation: 5205

The Query API performs sequential reads on DynamoDB partitions, starting at the partition key you provide in KeyConditions. If your schema shards a partition key by adding prefixes, you can run Query in parallel, one call per shard. To avoid getting the same results again, set ExclusiveStartKey on each subsequent Query call to the LastEvaluatedKey returned by the previous call.
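The ExclusiveStartKey / LastEvaluatedKey loop can be sketched roughly as below. This is a self-contained illustration, not DynamoDB code: `PageSource`, `Page`, `inMemorySource` and `forEachPage` are hypothetical names, and an in-memory list of integers stands in for the table. With the mapper from the question, the call inside the loop would probably be `mapper.queryPage(...)` on an expression built with `withLimit(...)` and `withExclusiveStartKey(...)`, reading the cursor from `getLastEvaluatedKey()` of the returned page. Because each page is handed to a callback and then dropped, only one page has to fit in the heap at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class PagedQuerySketch {

    /** One page of results plus the cursor for the next page (null when there are no more pages). */
    static final class Page {
        final List<Integer> items;
        final Integer lastEvaluatedKey;

        Page(List<Integer> items, Integer lastEvaluatedKey) {
            this.items = items;
            this.lastEvaluatedKey = lastEvaluatedKey;
        }
    }

    /** Hypothetical stand-in for a single Query call that takes a start key and a limit. */
    interface PageSource {
        Page query(Integer exclusiveStartKey, int limit);
    }

    /** In-memory "table" holding the keys 0..totalRecords-1, used only to make the sketch runnable. */
    static PageSource inMemorySource(int totalRecords) {
        return (exclusiveStartKey, limit) -> {
            // Resume right after the last key of the previous page.
            int start = (exclusiveStartKey == null) ? 0 : exclusiveStartKey + 1;
            int end = Math.min(start + limit, totalRecords);
            List<Integer> items = new ArrayList<>();
            for (int i = start; i < end; i++) {
                items.add(i);
            }
            // A null cursor signals that the last page has been reached.
            Integer last = (end < totalRecords) ? Integer.valueOf(end - 1) : null;
            return new Page(items, last);
        };
    }

    /** Fetches page by page, passing each page to the handler; returns the number of pages read. */
    static int forEachPage(PageSource source, int limit, Consumer<List<Integer>> handler) {
        int pages = 0;
        Integer startKey = null; // no start key on the first call
        do {
            Page page = source.query(startKey, limit);
            handler.accept(page.items);       // process, then let the page be garbage collected
            pages++;
            startKey = page.lastEvaluatedKey; // feed the cursor into the next call
        } while (startKey != null);
        return pages;
    }

    public static void main(String[] args) {
        int[] total = {0};
        int pages = forEachPage(inMemorySource(2500), 1000, items -> total[0] += items.size());
        System.out.println(pages + " pages, " + total[0] + " records"); // prints "3 pages, 2500 records"
    }
}
```

If the schema is sharded as described above, one such loop per shard key could run on its own thread, since each shard keeps an independent cursor.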

Upvotes: 1
