Scanning on GSI Vs Scanning the entire table in DynamoDB

Question

I have the following table in DynamoDB. ID is the partition key and Category is the sort key. IDs-1 and IDs-2 are GSIs. The values in IDs-1 and IDs-2 are comma-separated strings such as "list1,list2". I need to search the IDs-1 and IDs-2 columns. For example, I want to check whether list7 is present in either column.

In this case,

ID[Number]    Category[String]    IDs-1[String]    IDs-2[String]
1             category1           list1, list2
2             category2                            list7, list8
3             category1           list3, list4
4             category2                            list5, list6

I will have around 10K entries in this table in total.

What is the difference between scanning on GSI and scanning the entire table in DynamoDB?

Thanks

Chris Williams · Accepted Answer

Scanning either one costs the same in RCUs (read capacity units) if the index holds the same items and attributes as the table. A GSI has its own read capacity, so a scan on the index consumes capacity from the index's pool rather than the table's.

Looking at your data, I can see that some items are missing the indexed attribute, which according to the documentation means they will not be included in the index. For this reason, a scan of the GSI would be slightly cheaper, as there is less data in the GSI:

A global secondary index only tracks data items where its key attributes actually exist. For example, suppose that you added another new item to the GameScores table, but only provided the required primary key attributes.
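To make the sparse-index point concrete, here is a small local model of the table from the question. It makes no DynamoDB calls. It assumes each GSI uses IDs-1 or IDs-2 as its key attribute, and the helper names are made up for illustration.

```python
# Local model of the question's table; no AWS calls are made.
ITEMS = [
    {"ID": 1, "Category": "category1", "IDs-1": "list1, list2"},
    {"ID": 2, "Category": "category2", "IDs-2": "list7, list8"},
    {"ID": 3, "Category": "category1", "IDs-1": "list3, list4"},
    {"ID": 4, "Category": "category2", "IDs-2": "list5, list6"},
]


def items_in_sparse_index(items, key_attribute):
    """Return only the items that carry the index key attribute (hypothetical helper)."""
    return [item for item in items if key_attribute in item]


def scan_for_value(items, attribute, wanted):
    """Mimic a scan with a substring filter: every item is read, then filtered."""
    return [item for item in items if wanted in item.get(attribute, "")]


# Assumption: a GSI keyed on IDs-2 holds only the items that have IDs-2.
ids2_index = items_in_sparse_index(ITEMS, "IDs-2")
matches = scan_for_value(ids2_index, "IDs-2", "list7")
```

In this model the index on IDs-2 holds two of the four items, so a scan of it reads half as many items as a scan of the table. Note that the filter is still applied to every item that is read, so it narrows the result, not the amount of data scanned.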

Additionally, if fewer attributes are projected, this might affect the cost. 1 RCU is equal to 1 strongly consistent read or 2 eventually consistent reads of up to 4 KB. A scan is charged on the total size of the data it reads, so if projecting fewer attributes makes the items in your GSI smaller, you will pay less.
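The arithmetic above can be sketched as a rough estimate. The function name and the item sizes below are assumptions for illustration, not figures from the question. The estimate also ignores pagination and any per-item storage overhead, so treat the result as a ballpark only.

```python
import math

RCU_BLOCK_BYTES = 4 * 1024  # one read capacity unit covers up to 4 KB


def estimate_scan_rcu(total_bytes, strongly_consistent=False):
    """Rough RCU estimate for scanning total_bytes of data (hypothetical helper)."""
    blocks = math.ceil(total_bytes / RCU_BLOCK_BYTES)
    # An eventually consistent read costs half as much as a strongly consistent one.
    return float(blocks) if strongly_consistent else blocks / 2


# Hypothetical sizes: 10,000 table items of about 400 bytes each,
# versus 5,000 index items of about 150 bytes each after projection.
table_estimate = estimate_scan_rcu(10_000 * 400)
index_estimate = estimate_scan_rcu(5_000 * 150)
```

The default is eventually consistent because that is the only read mode a GSI supports. A smaller index, with fewer items and fewer projected attributes, gives a smaller total and therefore a cheaper scan.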
