Reputation: 14304
I know the whole design should be based on natural aggregates (documents). However, I'm thinking of implementing a separate table for localisations (lang, key, text) and then using the keys in other tables. I was unable to find any example of doing this.
Upvotes: 66
Views: 101540
Reputation: 3829
When I have needed to do this, I have used pandas in Python to do the joins across tables in memory.
It's not ideal because, as already said, DynamoDB is not a relational database. But there are times when you need to do something like maintain a mapping between IDs in two tables, and if this happens to you, using a library like pandas along with the SDK can help you out.
I have an application built on DynamoDB for which I now wish I had just opted for Postgres.
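The in-memory join described above can be sketched roughly as follows. This is a dependency-free illustration in plain Python rather than pandas itself (with pandas, `DataFrame.merge(..., on=..., how="left")` does the same job). The item shapes and the `text_key` attribute are assumptions for illustration, standing in for items already fetched from two tables via the SDK.

```python
from collections import defaultdict


def left_join(left_items, right_items, key):
    """Left-join two lists of item dicts on `key`, entirely in memory."""
    # Index the right-hand items by join key (one key may map to many items).
    index = defaultdict(list)
    for item in right_items:
        index[item[key]].append(item)

    joined = []
    for item in left_items:
        matches = index.get(item[key])
        if not matches:
            joined.append(dict(item))  # keep left items that have no match
            continue
        for match in matches:
            joined.append({**match, **item})  # left values win on name clashes
    return joined


# Hypothetical items, as they might come back from reading two tables.
products = [
    {"text_key": "greeting", "product_id": "p1"},
    {"text_key": "farewell", "product_id": "p2"},
]
localisations = [
    {"text_key": "greeting", "lang": "en", "text": "Hello"},
    {"text_key": "greeting", "lang": "de", "text": "Hallo"},
]

rows = left_join(products, localisations, "text_key")
```

Both tables have to fit in memory for this to work, which is part of why it is not ideal.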
Upvotes: 2
Reputation: 8406
With DynamoDB, rather than join I think the best solution is to store the data in the shape you later intend to read it.
If you find yourself requiring complex read queries you might have fallen into the trap of expecting DynamoDB to behave like an RDBMS, which it is not. Transform and shape the data you write, keep the read simple.
Disk is far cheaper than compute these days - don't be afraid to denormalise.
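As a rough sketch of what "shape the data you write" could mean for the localisation case in the question: copy the translated texts into each document at write time, so a read is a single item fetch with no join. All names here (`build_document_item`, `doc_id`, `title`, the `translations` layout) are hypothetical.

```python
def build_document_item(doc_id, title_key, translations):
    """Return one item that already contains every localised title."""
    return {
        "doc_id": doc_id,
        # Copy the texts into the document instead of storing only a key.
        "title": {
            lang: texts[title_key]
            for lang, texts in translations.items()
            if title_key in texts
        },
    }


def read_title(item, lang, fallback="en"):
    """Reading needs no second lookup: the text is already in the item."""
    titles = item["title"]
    return titles.get(lang, titles.get(fallback))


# Hypothetical localisation data, keyed by language and then by text key.
translations = {
    "en": {"greeting": "Hello"},
    "de": {"greeting": "Hallo"},
    "fr": {},
}

item = build_document_item("doc-1", "greeting", translations)
```

With boto3, an item like this could be stored with `Table.put_item(Item=item)`. The trade-off is that changing a translation means rewriting every document that embeds it.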
Upvotes: 40
Reputation: 46738
Update: This answer is well within the defined community guidelines and not a non-answer speaking only about a commercial solution.
One solution I have seen come up multiple times in this space is to sync from DynamoDB into a separate database that is better suited to the types of operations you're looking for.
I wrote a blog about this topic comparing various approaches I've seen people take to this very problem, but I'll summarize some of the key takeaways here so you don't have to read all of it.
(Full Disclosure: I work on the product team @ Rockset) Check out the blog for more details on the individual approaches.
Upvotes: 32
Reputation: 41
Recently I had the same requirement to use joins and aggregate functions like avg and sum with DynamoDB. To solve this I used the CData JDBC driver and it worked perfectly. It supports joins as well as aggregate functions. However, I am also searching for a solution that avoids CData because of its license cost.
Upvotes: 1
Reputation: 53
I know that my response is a couple of years late. However, I was able to dig up some additional information regarding Amazon DynamoDB and joins, which might benefit you (or anyone else who stumbles upon this discussion in the future).
To get to the point, I found documentation on the Amazon DynamoDB website which states that the Apache HiveQL query language can be used to perform joins on Amazon DynamoDB tables.
Querying Data in DynamoDB (w/ HiveQL): https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/EMRforDynamoDB.Querying.html
Working w/ Amazon DynamoDB & Apache Hive: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/EMRforDynamoDB.Tutorial.html
Processing Amazon DynamoDB Data with Apache Hive on Amazon EMR: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/EMRforDynamoDB.html
I hope this information helps someone out, if not the original poster.
Upvotes: 2
Reputation: 508
You must query the first table, then iterate through each item with a get request on the next table.
The other answers are unsatisfactory because 1) they don't answer the question and, more importantly, 2) they assume you can design your tables before knowing their future application. How can you? The technical debt is just too high to reasonably cover unbounded future possibilities.
My answer is horribly inefficient, but this is the only current solution to the posed question.
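A minimal sketch of that query-then-get loop, assuming table objects that behave like boto3's `Table` resource (`scan()` returning `Items` and optionally `LastEvaluatedKey`, `get_item(Key=...)` returning `Item` when found). `FakeTable` is a hypothetical in-memory stand-in so the example runs without AWS, and the second table is assumed to have a simple primary key named `text_key`.

```python
def join_by_lookup(first_table, second_table, key_name):
    """Scan `first_table`, then fetch each matching item from `second_table`."""
    joined = []
    scan_kwargs = {}
    while True:
        page = first_table.scan(**scan_kwargs)
        for item in page["Items"]:
            # One read request per item: simple, but slow on large tables.
            response = second_table.get_item(Key={key_name: item[key_name]})
            other = response.get("Item", {})
            joined.append({**other, **item})
        # Scans are paginated; continue from where the previous page stopped.
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            break
        scan_kwargs["ExclusiveStartKey"] = last_key
    return joined


class FakeTable:
    """In-memory stand-in for a table (illustration only, single page)."""

    def __init__(self, key_name, items):
        self.key_name = key_name
        self.items = items

    def scan(self, **kwargs):
        return {"Items": list(self.items)}

    def get_item(self, Key):
        wanted = Key[self.key_name]
        for item in self.items:
            if item[self.key_name] == wanted:
                return {"Item": item}
        return {}


documents = FakeTable("text_key", [
    {"doc_id": "d1", "text_key": "greeting"},
    {"doc_id": "d2", "text_key": "missing"},
])
texts = FakeTable("text_key", [{"text_key": "greeting", "text": "Hello"}])

result = join_by_lookup(documents, texts, "text_key")
```

Items with no match in the second table are kept as they are, so this behaves like a left join.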
I eagerly await a better answer.
Upvotes: 15
Reputation: 751
You are correct, DynamoDB is not designed as a relational database and does not support join operations. You can think about DynamoDB as just being a set of key-value pairs.
You can have the same keys across multiple tables (e.g. document_IDs), but DynamoDB doesn't automatically sync them or have any foreign-key features. The document_IDs in one table, while named the same, are technically a different set than the ones in a different table. It's up to your application software to make sure that those keys are synced.
DynamoDB is a different way of thinking about databases and you might want to consider using a managed relational database such as Amazon Aurora: https://aws.amazon.com/rds/aurora/
One thing to note: Amazon EMR does allow DynamoDB tables to be joined, but I'm not sure that's what you're looking for: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/EMRforDynamoDB.html
Upvotes: 64