Reputation: 42113
I'm looking to store a large collection of photos in a DynamoDB table. Each photo can belong to an "album" -- in fact, a photo can belong to multiple albums. I'd like to set the data up so I can query by album_id and retrieve all the photo_ids that belong to that album.
For example: "get me all the photos that belong to album 1"
table "album-photo-map"
keys(album_id, timestamp) - photo_id
I could then perform a range query on the table album-photo-map asking for all photo_ids that belong to album "1" with a range_key of timestamp greater than 0.
The problem is -- what if there are two photos with the same timestamp? DynamoDB won't let me have multiple items with the same key.
One way around this might be to store a binary list of photo_ids in one of the data fields for the album_id, but then the list of photos would be limited to 64K, which I'd rather avoid.
Am I thinking about this correctly? Is there a solution to the duplicate timestamp problem? Perhaps I could do something like:
timestamp = str(time.time()).replace('.','')
>> 134704419008
and store that? Would that be fast enough to eliminate the duplicate problem?
Upvotes: 1
Views: 577
Reputation: 7152
You can use a hash of the image as the range_key. With a well-chosen hash function, there is very little chance that two different images produce the same hash. This key has the added benefit of being derived directly from the content.
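As a rough sketch of the content-hash idea, assuming SHA-256 as the hash function: the helper names and the content_hash attribute below are made up for illustration and are not part of any DynamoDB API. The code only builds the item as a plain dict; writing it to the table is left out.

```python
import hashlib


def photo_range_key(image_bytes: bytes) -> str:
    """Hex SHA-256 digest of the image content (hypothetical helper)."""
    return hashlib.sha256(image_bytes).hexdigest()


def album_photo_item(album_id: str, photo_id: str, image_bytes: bytes) -> dict:
    """Build an item keyed by (album_id, content_hash).

    The attribute name "content_hash" is an assumption for this sketch.
    """
    return {
        "album_id": album_id,                            # hash key
        "content_hash": photo_range_key(image_bytes),    # range key
        "photo_id": photo_id,
    }


item_a = album_photo_item("1", "photo-a", b"first image bytes")
item_b = album_photo_item("1", "photo-b", b"second image bytes")
```

One trade-off to keep in mind: with this key the items in an album sort by hash value, not by upload time.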
If performance matters, you can instead simply append a random number to the timestamp key.
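A minimal sketch of the random-suffix variant, assuming a string range key. The function name, the millisecond resolution, and the 8-hex-digit suffix are all choices made for this example. The timestamp is zero-padded to a fixed width so that string ordering follows time ordering; the suffix makes duplicate keys unlikely, not impossible.

```python
import secrets
import time


def timestamp_range_key(now=None) -> str:
    """Return '<13-digit millis>-<8 hex chars>' (hypothetical key format)."""
    if now is None:
        now = time.time()
    millis = int(now * 1000)
    # Random suffix so two photos in the same millisecond get different keys.
    suffix = secrets.token_hex(4)
    return f"{millis:013d}-{suffix}"


key_1 = timestamp_range_key(1347044190.0)
key_2 = timestamp_range_key(1347044190.0)
```

Because the timestamp prefix is preserved, a range query such as "everything after a given time" still works by comparing against the zero-padded prefix.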
Upvotes: 1