Reputation: 1085
I want to store the visit logs of 10 websites, which receive a total of 10M visits a month, in DynamoDB. After that I want to build a back end to monitor and detect fraudulent activity.
Can DynamoDB handle complex queries like:
- List of visitors with a X bounce rate in a specified Interval
- Popular Destination URI between date/time & date/time
- Ordering & Grouping By
Upvotes: 0
Views: 1598
Reputation: 739
- List of visitors with a X bounce rate in a specified Interval
DynamoDB can handle a lot of queries, but it requires some planning of your access patterns. A query must specify the hash key with an equality condition and can narrow the results with one condition on the range key, either the table's own or that of a local secondary index. That condition uses a single comparison operator (the familiar =, <, <=, >, >=, BETWEEN, or begins_with), and the results come back sorted by the range key.
If you require a query like SELECT col1 FROM table1 WHERE condition1 > x AND condition2 > y AND condition3 > z, you're not necessarily stuck, but you need to plan. Such queries can be made, but they might require querying sequentially across multiple tables or embedding part of the query logic in the hash key (e.g. hashkey = BOUNCE_RATE:1, BOUNCE_RATE:2, ...), where :N is some meaningful tier of bounce rates. In my own experience this is not unusual at all.
A caveat with this example is that you may get a poor distribution of the hash key across nodes (i.e. hot keys that degrade performance and could defeat the scalability advantage of DynamoDB), but my explanation is only meant to give you ways of thinking about access patterns.
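To make the tiered hash key idea concrete, here is a minimal sketch in Python. It only builds the keyword arguments you would pass to the low-level boto3 `query` call; it does not talk to AWS. The table name `visits_by_tier`, the attribute names `tier_key` and `visited_at`, and the four equal-width tiers are all assumptions made for illustration, not anything DynamoDB prescribes.

```python
from datetime import datetime
from typing import Any, Dict

TIER_COUNT = 4  # assumption: four equal-width bounce-rate tiers


def bounce_tier(rate: float) -> int:
    """Map a bounce rate in [0.0, 1.0] to a tier number 1..TIER_COUNT."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("bounce rate must be between 0 and 1")
    # A rate of exactly 1.0 is clamped into the top tier.
    return min(int(rate * TIER_COUNT) + 1, TIER_COUNT)


def tier_query_params(tier: int, start: datetime, end: datetime) -> Dict[str, Any]:
    """Build query arguments: equality on the hash key, BETWEEN on the range key."""
    return {
        "TableName": "visits_by_tier",  # hypothetical table name
        "KeyConditionExpression": (
            "tier_key = :pk AND visited_at BETWEEN :start AND :end"
        ),
        "ExpressionAttributeValues": {
            ":pk": {"S": f"BOUNCE_RATE:{tier}"},
            # ISO 8601 strings sort chronologically, so BETWEEN works on them.
            ":start": {"S": start.isoformat()},
            ":end": {"S": end.isoformat()},
        },
    }
```

With a boto3 client in hand, something like `client.query(**tier_query_params(bounce_tier(0.9), start, end))` would fetch one tier for one time window. Note that every visitor in a tier shares a single hash key, which is exactly the hot key risk mentioned above.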
Assuming bounce rate is some precise decimal value, you could put it in either the range key or a local secondary index, which would then require including the time interval in the hash key. This requires the time intervals to be pre-determined (just as the first example requires the bounce rate "tiers" to be pre-determined). You'll need to consider these kinds of trade-offs.
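The reverse layout, with the time interval in the hash key and the bounce rate in a local secondary index, could look roughly like the sketch below. The monthly bucket size, the table name `visits_by_month`, the index name `bounce_rate_index`, and the attribute names `month_bucket` and `bounce_rate` are assumptions for illustration. A date range that spans several buckets needs one query per bucket.

```python
from datetime import date
from typing import Any, Dict, List


def month_buckets(start: date, end: date) -> List[str]:
    """List the 'YYYY-MM' hash key values covering start..end inclusive."""
    buckets = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        buckets.append(f"{year:04d}-{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return buckets


def bucket_query_params(bucket: str, min_rate: float) -> Dict[str, Any]:
    """Build query arguments for one bucket, filtering on the index range key."""
    return {
        "TableName": "visits_by_month",    # hypothetical table name
        "IndexName": "bounce_rate_index",  # hypothetical local secondary index
        "KeyConditionExpression": "month_bucket = :b AND bounce_rate >= :r",
        "ExpressionAttributeValues": {
            ":b": {"S": bucket},
            # The low-level API sends numbers as strings under the "N" type.
            ":r": {"N": str(min_rate)},
        },
    }


def range_queries(start: date, end: date, min_rate: float) -> List[Dict[str, Any]]:
    """One set of query arguments per month bucket in the requested range."""
    return [bucket_query_params(b, min_rate) for b in month_buckets(start, end)]
```

The index is used here rather than the table's own range key because many visitors can share the same bounce rate within a month, and a table's hash plus range key pair has to be unique per item, while index keys do not.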
Finally, you can create multiple tables, each holding the data of either a single bounce rate tier or a single time interval. There are other basic approaches as well, but this is food for thought.
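For the table-per-interval variant, the routing logic is small. The sketch below picks a monthly table from the visit timestamp and builds the arguments for the low-level boto3 `put_item` call, again without contacting AWS. The `visits_YYYY_MM` naming scheme and the attribute names `visitor_id`, `visited_at`, and `uri` are assumptions for illustration.

```python
from datetime import datetime
from typing import Any, Dict


def table_for(visited_at: datetime) -> str:
    """Name of the monthly table that holds a visit (hypothetical scheme)."""
    return f"visits_{visited_at.year:04d}_{visited_at.month:02d}"


def put_visit_params(visitor_id: str, visited_at: datetime, uri: str) -> Dict[str, Any]:
    """Build put_item arguments that route the visit to its monthly table."""
    return {
        "TableName": table_for(visited_at),
        "Item": {
            "visitor_id": {"S": visitor_id},             # hash key (assumed)
            "visited_at": {"S": visited_at.isoformat()},  # range key (assumed)
            "uri": {"S": uri},
        },
    }
```

One practical upside of this layout is that old months can be dropped by deleting a whole table, but the tables have to be created ahead of time.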
Upvotes: 1