purushottam

Reputation: 75

Is there a way to overcome the 1 MB scan limit with an asynchronous parallel scan in DynamoDB?

When I scan the data, each segment is limited to 1 MB per request. Is it possible to retrieve all the data at once using an asynchronous parallel scan in DynamoDB?

Upvotes: 1

Views: 406

Answers (1)

Alexander Patrikalakis

Reputation: 5195

When doing a parallel scan, each Scan API call returns at most 1 MB of data. So, if you increase the number of segments enough, you will be able to get all of your data in one round trip. According to the documentation, the maximum number of segments is 1 million. That means that as long as you have enough provisioned read throughput on your table, and the size of your table is less than about 976 GB (1 million segments at 1 MB each), reading the entire table in one round trip is possible. Each 1 MB page consumes 128 RCU if ConsistentRead=false (1 MB / 4 KB = 256 reads, halved for eventually consistent reads). Therefore, if each partition supports up to 3000 reads per second, each partition can serve up to 23 segments in parallel (3000 / 128, rounded down). Dividing 1 million segments by 23 segments per partition, and rounding up, yields 43479 partitions required to support a simultaneous read of 976 GB.
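The arithmetic above can be reproduced with a short sketch. The constants are the figures quoted in this answer (1 MB per page, 1 million segments, 3000 reads per second per partition); the variable names are illustrative only.

```python
import math

MB_PER_PAGE = 1                 # each Scan call returns at most 1 MB
MAX_SEGMENTS = 1_000_000        # maximum number of segments quoted above
RCU_PER_PAGE = (1024 // 4) // 2  # 1 MB / 4 KB = 256 reads, halved when ConsistentRead=false
PARTITION_RCU = 3000            # per-partition read ceiling assumed in this answer

# Largest table that fits in one round trip, in GB (1024 MB per GB).
max_table_gb = MAX_SEGMENTS * MB_PER_PAGE / 1024

# Segments one partition can serve at once, rounded down.
segments_per_partition = PARTITION_RCU // RCU_PER_PAGE

# Partitions needed to serve every segment at once, rounded up.
partitions_needed = math.ceil(MAX_SEGMENTS / segments_per_partition)

print(RCU_PER_PAGE, int(max_table_gb), segments_per_partition, partitions_needed)
```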

To create a table with 43479 partitions or more, find the next power of 2 above that number. In this case, the next power of 2 above 43479 is 2^16 = 65536. Provision a table with 65536 * 750 = 49152000 writes per second (WPS) and 49152000 reads per second (RPS) to create it with 65536 partitions. The instantaneous RCU consumption of this parallel scan will be 128 * 1000000 = 128 million RCU, so you would need to re-provision your table at 128 million RPS before performing the parallel scan.
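A minimal sketch of the sizing step, using the numbers from this answer. The helper `next_power_of_two` is a hypothetical name, and it returns the input itself when that is already a power of 2, which does not matter for 43479.

```python
def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n, for n >= 1."""
    return 1 << (n - 1).bit_length()


PER_PARTITION_RATE = 750    # per-partition figure used in this answer
RCU_PER_PAGE = 128          # one 1 MB page with ConsistentRead=false
TOTAL_SEGMENTS = 1_000_000

partitions = next_power_of_two(43479)
provisioned_rate = partitions * PER_PARTITION_RATE   # applies to both WPS and RPS
scan_rcu = RCU_PER_PAGE * TOTAL_SEGMENTS             # instantaneous cost of the full scan

print(partitions, provisioned_rate, scan_rcu)
```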

For any of this to work, you will need to request increases to your account-level and table-level provisioned capacity limits. Otherwise, in us-east-1, the default table provisioning limits are 40k RPS and 40k WPS per table. If you provision a table at 40k reads per second, you can read a maximum of 312 segments in parallel (40000 / 128, rounded down), or 312 MB.
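As a rough illustration, here is a sketch of a one-round parallel scan sized to the provisioned read capacity. It assumes a client object exposing a `scan` method with the `TableName`, `Segment`, `TotalSegments` and `ConsistentRead` parameters, as the boto3 DynamoDB client does. The function names are hypothetical, and the sketch fetches only one page per segment, so it does not follow `LastEvaluatedKey` when a segment holds more than 1 MB.

```python
from concurrent.futures import ThreadPoolExecutor

RCU_PER_PAGE = 128  # one 1 MB page with ConsistentRead=false, per this answer


def max_parallel_segments(provisioned_rcu: int) -> int:
    """Number of 1 MB pages that fit in the provisioned RCU, rounded down."""
    return provisioned_rcu // RCU_PER_PAGE


def scan_segment(client, table_name: str, segment: int, total_segments: int) -> list:
    """Issue one Scan call for one segment and return that single page of items."""
    response = client.scan(
        TableName=table_name,
        Segment=segment,
        TotalSegments=total_segments,
        ConsistentRead=False,
    )
    return response.get("Items", [])


def parallel_scan_one_round(client, table_name: str, total_segments: int) -> list:
    """Scan all segments concurrently, one request per segment, and merge the items."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        pages = pool.map(
            lambda s: scan_segment(client, table_name, s, total_segments),
            range(total_segments),
        )
        return [item for page in pages for item in page]
```

With boto3 installed, this could be called as `parallel_scan_one_round(boto3.client("dynamodb"), "my-table", max_parallel_segments(40_000))`, where `my-table` is a placeholder table name.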

Upvotes: 1
