Nick M

Reputation: 2532

Moving MySQL table to AWS DynamoDB - how to set it up?

I have this table on a RDS instance:

+--------------+---------------------+------+-----+---------+----------------+
| Field        | Type                | Null | Key | Default | Extra          |
+--------------+---------------------+------+-----+---------+----------------+
| id           | bigint(20)          | NO   | PRI | NULL    | auto_increment |
| match_id     | bigint(10) unsigned | NO   | MUL | NULL    |                |
| prop_type_id | bigint(10) unsigned | NO   |     | NULL    |                |
| title        | varchar(45)         | NO   | MUL | NULL    |                |
| odds         | double              | YES  |     | NULL    |                |
| status       | tinyint(4) unsigned | YES  |     | 1       |                |
| selection_id | bigint(15)          | YES  | MUL | NULL    |                |
| market_id    | bigint(15)          | YES  | MUL | NULL    |                |
| date_time    | datetime            | NO   |     | NULL    |                |
| available    | int(11)             | NO   |     | NULL    |                |
| source       | tinyint(4)          | YES  |     | NULL    |                |
+--------------+---------------------+------+-----+---------+----------------+

The best setup we found for indexes is to set them on match_id, prop_type_id, selection_id and market_id.

At this time the DB is around 1.5 GB in size, and we get anywhere between 100 and 500 queries per second against this table; soon this will go much higher. About 75% of these are selects; the rest are updates and deletes. The data is quite volatile. This results in hangs with MyISAM and lots of deadlocks with InnoDB.

I've already tried SimpleDB; aside from taking about 18 hours to modify our Rails code to work with it, a simple select now takes anywhere between 1 and 6 seconds, and it's not always consistent. I had to memcache it heavily, but that is a major disadvantage: we want the data to update in the DB and on screen at most every 8 seconds or so.

I'd like to move this to DynamoDB, which hopefully performs much better in high-traffic scenarios.

Now the challenge is that we need to search based on the following:

- id (this can be anything as long as it's unique, but we don't always use it when querying)
- match_id
- prop_type_id
- title
- selection_id
- market_id
- status (0..2)

In general we query either on match_id and prop_type_id, or on match_id, prop_type_id, market_id and selection_id. Queries based on title are rare but they can't be avoided. Same for status.

Is there a way we could model this for DynamoDB?

I know they support only 5 secondary indexes per table, and they also require the ID to be known at all times. How am I supposed to know the IDs without first looking up the data? I don't know, but maybe I'm approaching the whole thing wrong?

Thanks!

Upvotes: 0

Views: 533

Answers (2)

Hudson Liang

Reputation: 31

I think your use case is a good fit for DynamoDB.

It can be done this way:

Put match_id and prop_type_id into one composite attribute and make that attribute the range key, with id as the hash key.

Then you can set up a local secondary index on market_id with selection_id projected as well. You may also want to set up another local secondary index on title.
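The composite attribute is just a string the application builds before writing the item; a minimal sketch in Python (the attribute name `match_prop` and the `#` separator are my own assumptions, not anything DynamoDB mandates):

```python
def composite_range_key(match_id, prop_type_id, sep="#"):
    """Join match_id and prop_type_id into a single composite
    range-key string, e.g. 12345#7."""
    return f"{match_id}{sep}{prop_type_id}"

# An item as it might be written to the table: id is the hash key,
# match_prop is the composite range key built above.
item = {
    "id": 1001,
    "match_prop": composite_range_key(12345, 7),
    "market_id": 555,
    "selection_id": 42,
    "title": "Over 2.5 goals",
}
```

Queries on match_id and prop_type_id together then become exact (or begins-with) matches on this one attribute.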

Upvotes: 1

w xiao

Reputation: 151

Local secondary index does not fit your query pattern, as you have noticed.

At this time, you will need to build your own index using multiple DynamoDB tables, which means your application needs to write both to the main table and to the table that acts as an index on it. For example, your main table may have "id" as the hash key, and your index table may have "match_id" as the hash key and "id" as the range key.
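The dual-write pattern above can be sketched with plain dicts standing in for the two DynamoDB tables (the table layout and helper names are illustrative assumptions, not a DynamoDB API):

```python
# Plain dicts stand in for the two DynamoDB tables.
main_table = {}   # hash key: id
index_table = {}  # hash key: match_id, range key: id

def put_item(item):
    """Write to the main table AND mirror the key mapping into the
    index table -- the application must do both writes itself."""
    main_table[item["id"]] = item
    index_table.setdefault(item["match_id"], set()).add(item["id"])

def query_by_match(match_id):
    """Look up ids in the index table, then fetch the full items
    from the main table."""
    ids = index_table.get(match_id, set())
    return [main_table[i] for i in sorted(ids)]

put_item({"id": 1, "match_id": 100, "title": "Over 2.5"})
put_item({"id": 2, "match_id": 100, "title": "Under 2.5"})
put_item({"id": 3, "match_id": 200, "title": "Draw"})
```

The cost of this design is that deletes and updates must also touch both tables, and the two writes are not atomic, so the application has to tolerate (or repair) a briefly stale index.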

Upvotes: 3
