Reputation: 8810
I've recently been looking into Amazon's new NoSQL service, DynamoDB.
Amazon says you should avoid using unevenly distributed keys as the primary key; in other words, the more unique the primary keys are, the better. Does that mean having a unique primary key for every item is the best case? What about having some items with duplicate keys?
I want to know how the underlying mechanism works so I can judge how bad it can get.
Upvotes: 2
Views: 3183
Reputation: 6913
Tables are partitioned across multiple machines based on the hash key, so the more random the hash keys are, the better. In my app I use company_id as the hash key and a unique id as the range key; that way my tables are distributed reasonably evenly.
What they are trying to avoid is your using the same hash key for the majority of your data. The more random the hash keys are, the easier it is for DynamoDB to keep returning your data quickly.
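To make the effect concrete, here is a rough simulation of hash-based partitioning. It is only a sketch: the hash function DynamoDB uses and the number of partitions are internal to the service, so the MD5 hash, `NUM_PARTITIONS`, `partition_for`, `load_per_partition`, and the `company-N` key names are all assumptions made up for illustration.

```python
import hashlib
from collections import Counter

# Hypothetical partition count; DynamoDB manages the real number internally.
NUM_PARTITIONS = 8


def partition_for(hash_key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a hash key to a partition index (illustrative, not DynamoDB's real scheme)."""
    digest = hashlib.md5(hash_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def load_per_partition(hash_keys) -> Counter:
    """Count how many items land on each partition."""
    return Counter(partition_for(key) for key in hash_keys)


# Case 1: 10,000 items, each with a different hash key.
spread = load_per_partition(f"company-{i}" for i in range(10_000))

# Case 2: 10,000 items, 9,000 of which share the same hash key.
skewed_keys = ["company-0"] * 9_000 + [f"company-{i}" for i in range(1, 1_001)]
skewed = load_per_partition(skewed_keys)

print("busiest partition, varied keys:", max(spread.values()))
print("busiest partition, skewed keys:", max(skewed.values()))
```

With varied keys the items end up roughly evenly divided among the partitions. With the skewed keys, every item sharing `company-0` lands on the same partition, so that one machine takes at least 90% of the load no matter how many partitions exist. Items that share a hash key are fine in moderation (that is what the range key is for); the problem is when one hash key dominates the data or the traffic.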
Upvotes: 4