Best way to structure DynamoDB table?

Question

I'm building a website where users pull images and add annotations to them, and I'm wrestling with the most efficient way to structure the table. Consider the following:

Users must not see the same image twice, so I need some item that tracks which images each user has already seen
There will be about 1000 images listed in the table
Unknown number of users, but I doubt I'd be hitting the 400 KB item size limit
At some point, I'd like to gamify the metadata so users can compare their metadata with others

I'm guessing img ID and user ID as partition and sort keys are the best choices, although that leaves 1000 items per user, and when new images are added I would need to add an item for every user, which I could probably do easily enough with a secondary index. I'd like to avoid scans entirely, if possible.

Ben · Accepted Answer

If you want a single table, you might consider two types of items in this table:

1. (Unannotated) Image

Partition key: imgID_xxx

Range key: img

2. Annotated image by user

Partition key: userID_xxx

Range key: imgID_xxx

Annotation: some annotation...

So initially you'll only have your 1000 unannotated image items, which users can query via a GSI, for instance a sparse one keyed on the isImg attribute (hashKey is what I'm calling the partition key here):

hashKey  | rangeKey | isImg | ...
img_0001 | img      | 1     | 
img_0002 | img      | 1     |
...
img_1000 | img      | 1     |

When any user downloads any image, they'll get this common item to start with. The "Annotated image by user" items are only generated lazily, after a user annotates an image.

If a user wants to annotate an image, you will need to write an "Annotated image by user" item, which is partitioned by userID but should also be covered by a GSI on imgID.
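As a rough sketch of what such a table definition could look like, the function below builds a request in the shape of DynamoDB's CreateTable call. The table name, both index names, the choice of isImg and imgID as the GSI partition keys, and on-demand billing are all assumptions made for illustration; the answer does not specify them.

```python
def table_definition(table_name="AnnotationsTable"):
    """Build CreateTable-style arguments for the single-table layout.

    All names here (AnnotationsTable, ImagesIndex, ImageAnnotationsIndex)
    are hypothetical; only hashKey, rangeKey, isImg and imgID come from
    the answer's example rows.
    """
    return {
        "TableName": table_name,
        # Assumption: on-demand billing, so no throughput settings are needed.
        "BillingMode": "PAY_PER_REQUEST",
        # Only attributes used in a key schema are declared.
        "AttributeDefinitions": [
            {"AttributeName": "hashKey", "AttributeType": "S"},
            {"AttributeName": "rangeKey", "AttributeType": "S"},
            {"AttributeName": "isImg", "AttributeType": "N"},
            {"AttributeName": "imgID", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "hashKey", "KeyType": "HASH"},
            {"AttributeName": "rangeKey", "KeyType": "RANGE"},
        ],
        "GlobalSecondaryIndexes": [
            {
                # Sparse index: only image items carry isImg.
                "IndexName": "ImagesIndex",
                "KeySchema": [
                    {"AttributeName": "isImg", "KeyType": "HASH"},
                    {"AttributeName": "hashKey", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            },
            {
                # Sparse index: only annotation items carry imgID.
                "IndexName": "ImageAnnotationsIndex",
                "KeySchema": [
                    {"AttributeName": "imgID", "KeyType": "HASH"},
                    {"AttributeName": "hashKey", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            },
        ],
    }
```

If boto3 is in use, a dictionary like this could be passed as `boto3.client("dynamodb").create_table(**table_definition())`.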

For example if user_111 annotated two images (img_0002 and img_0042) and then user_222 annotated just one image (img_0002):

hashKey  | rangeKey | isImg | annotation | imgID    |
img_0001 | img      | 1     |            |          |
img_0002 | img      | 1     |            |          |
...
img_1000 | img      | 1     |            |          |
user_111 | img_0002 |       | "foo"      | img_0002 |
user_111 | img_0042 |       | "bar"      | img_0042 |
user_222 | img_0002 |       | "baz"      | img_0002 |
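The rows above can be reproduced with two small item builders. This is only a sketch in plain Python: the function names are hypothetical, and the dictionaries use native Python values, which is the shape boto3's `Table.put_item(Item=...)` accepts.

```python
def image_item(img_id):
    """Item type 1: the shared, unannotated image."""
    return {"hashKey": img_id, "rangeKey": "img", "isImg": 1}


def annotation_item(user_id, img_id, annotation):
    """Item type 2: one user's annotation of one image.

    imgID duplicates the range key so a GSI can be keyed on it.
    isImg is deliberately absent, which keeps these items out of
    any index keyed on isImg.
    """
    return {
        "hashKey": user_id,
        "rangeKey": img_id,
        "annotation": annotation,
        "imgID": img_id,
    }


# The example state from the table: 1000 images plus three annotations.
items = [image_item(f"img_{n:04d}") for n in range(1, 1001)]
items += [
    annotation_item("user_111", "img_0002", "foo"),
    annotation_item("user_111", "img_0042", "bar"),
    annotation_item("user_222", "img_0002", "baz"),
]
```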

This will allow a user to:

Query all images (via the first GSI): 1000 items returned
Query all images they have annotated (they sit in a single userID partition)
Query all the annotations that were made on a single image (via the second GSI) i.e. in this case it would return one item for img_0042, or two items for img_0002.
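The three access patterns above might translate into query arguments along these lines. This is a sketch, not a definitive implementation: the index names ImagesIndex and ImageAnnotationsIndex are hypothetical, and the dictionaries are written in the shape that boto3's `Table.query(**kwargs)` accepts (string key condition with placeholder names and values).

```python
def all_images_query():
    """Pattern 1: every image item, via the (assumed) GSI keyed on isImg."""
    return {
        "IndexName": "ImagesIndex",  # hypothetical index name
        "KeyConditionExpression": "#flag = :one",
        "ExpressionAttributeNames": {"#flag": "isImg"},
        "ExpressionAttributeValues": {":one": 1},
    }


def user_annotations_query(user_id):
    """Pattern 2: everything in one user's partition of the base table."""
    return {
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": "hashKey"},
        "ExpressionAttributeValues": {":pk": user_id},
    }


def image_annotations_query(img_id):
    """Pattern 3: all annotations of one image, via the (assumed) GSI on imgID."""
    return {
        "IndexName": "ImageAnnotationsIndex",  # hypothetical index name
        "KeyConditionExpression": "#img = :img",
        "ExpressionAttributeNames": {"#img": "imgID"},
        "ExpressionAttributeValues": {":img": img_id},
    }
```

None of these is a scan. A single Query response is capped at 1 MB, so a caller may need to follow LastEvaluatedKey to collect every page.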

When adding a new image, only a single item would need to be added. Only once a user annotates that image will you need to create the extra item for that user as well.
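For the requirement that users never see the same image twice, one possible approach with this layout is to subtract the user's annotated images from the full image list on the application side. The sketch below assumes that "seen" means "annotated", which the answer does not state, and works on plain lists of items such as the results of the image query and the user-partition query; the function name is hypothetical.

```python
def unseen_image_ids(image_items, user_items):
    """Return IDs of images the user has not annotated yet.

    image_items: image items (hashKey holds the image ID).
    user_items:  the user's annotation items (rangeKey holds the image ID).
    """
    seen = {item["rangeKey"] for item in user_items}
    return [item["hashKey"] for item in image_items if item["hashKey"] not in seen]


# Small worked example using the item shapes from the answer.
images = [
    {"hashKey": "img_0001", "rangeKey": "img", "isImg": 1},
    {"hashKey": "img_0002", "rangeKey": "img", "isImg": 1},
    {"hashKey": "img_0042", "rangeKey": "img", "isImg": 1},
]
user_111 = [
    {"hashKey": "user_111", "rangeKey": "img_0002", "annotation": "foo", "imgID": "img_0002"},
    {"hashKey": "user_111", "rangeKey": "img_0042", "annotation": "bar", "imgID": "img_0042"},
]
remaining = unseen_image_ids(images, user_111)  # ["img_0001"]
```

A newly added image shows up in this result for every user automatically, since no per-user item exists for it until someone annotates it.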
