Dogukan Tufekci
Dogukan Tufekci

Reputation: 3108

RethinkDB: How to dynamically build merge queries?

I am building an API using Python Flask. I love the ease for coding merge queries on RethinkDB. Even better, I've noticed that it may be possible to write a thin layer to code dynamic merge queries in regards to user input.

Let's say we are building a chat app. Here is the example code: https://github.com/dogukantufekci/rethinkdb_dynamic_merge

Tables and fields:

  1. accounts: "id", "created_on", "name", "email", "password"
  2. conversations: "id", "created_on", "subject", "to" (list of participants)
  3. messages: "id", "created_on", "text", "conversation", "from"
  4. message_readers: "id", "message", "reader"

Query to merge all 4 tables:

r.table("accounts").map(lambda account: 
    account.merge({
        "conversations": r.table("conversations").filter(lambda conversation: 
            conversation["to"].contains(account["id"])).coerce_to("array").map(lambda conversation:
            conversation.merge({
                "to": conversation["to"].map(lambda account: 
                    r.table("accounts").get(account)).coerce_to("array"),
                "messages": r.table("messages").filter(lambda message:
                    message["conversation"] == conversation["id"]).coerce_to("array").map(lambda message:
                    message.merge({
                        "from": r.table("accounts").get(message["from"]),
                        "readers": r.table("message_readers").filter(lambda readers:
                            readers["message"] == message["id"]).coerce_to("array"),
                    }))
            }))
    })).run(g.db_connection)

Result:

[{
    "id": "account111",
    "created_on": 1392515093.252,  
    "name": "John Doe",
    "email": "[email protected]",
    "conversations": [
        {
            "id": "conversation111",
            "created_on": 1392515093.252,  
            "subject": "Merging Queries on RethinkDB",
            "to": [
                {
                    "id": "account111",
                    "created_on": 1392515093.252,  
                    "name": "John Doe", 
                    "email": "[email protected]", 
                }, 
                {
                    "id": "account222",
                    "created_on": 1392515604.123,  
                    "name": "Mark Bobby", 
                    "email": "[email protected]", 
                }, 
            ], 
            "messages": [
                {
                    "id": "message111",
                    "created_on": 1392515604.123,  
                    "text": "How do we dynamically build merge queries?", 
                    "conversation": "conversation111", 
                    "from": {
                        "id": "account111",
                        "created_on": 1392515093.252,  
                        "name": "John Doe",
                        "email": "[email protected]",
                    }, 
                    "readers": [
                        {
                            "id": "message_reader111", 
                            "created_on": 1392515604.123, 
                            "message": "message111",
                            "reader": "account111",
                        }, 
                        {
                            "id": "message_reader222", 
                            "created_on": 1392515604.123, 
                            "message": "message111",
                            "reader": "account222",
                        },
                    ],
                },
            ], 
        }, 
    ],        
}]

Great so far!

A simpler response needs to return account data with conversations; no messages:

[{
    "id": "account111",
    "created_on": 1392515093.252,  
    "name": "John Doe",
    "email": "[email protected]",
    "conversations": [
        {
            "id": "conversation111",
            "created_on": 1392515093.252,  
            "subject": "Merging Queries on RethinkDB",
            "to": [
                {
                    "id": "account111",
                    "created_on": 1392515093.252,  
                    "name": "John Doe", 
                    "email": "[email protected]", 
                }, 
                {
                    "id": "account222",
                    "created_on": 1392515604.123,  
                    "name": "Mark Bobby", 
                    "email": "[email protected]", 
                }, 
            ], 
        }, 
    ],        
}]

There are two ways to get this result:

  1. We can re-write a query:

    r.table("accounts").map(lambda account: 
        account.merge({
            "conversations": r.table("conversations").filter(lambda conversation: 
                conversation["to"].contains(account["id"])).coerce_to("array").map(lambda conversation:
                conversation.merge({
                    "to": conversation["to"].map(lambda account: 
                        r.table("accounts").get(account)).coerce_to("array"),
                }))
        })).run(g.db_connection)
    

    Disadvantage: If there's a need to create further queries for alternative field combinations, this is not the best practice, because it's not dynamic and there's a lot of repetition.

  2. We can modify the last line of the large query with pluck to select fields:

    })).pluck(["id", "created_on", "name", "email", {"conversations": ["id", "created_on", "subject", {"to": ["id", "created_on", "name", "email"]}]}]).run(g.db_connection)
    

    Advantage: It is dynamic as it enables users to pluck value as an argument via URL

    http://www.myproject.com/accounts/?pluck=["id", "created_on", "name", "email", {"conversations": ["id", "created_on", "subject", {"to": ["id", "created_on", "name", "email"]}]}]
    

    Disadvantage: Query does go consume a lot of computational energy to merge tables which we dont need in the final result.

So the challenge is to dynamically build queries by accepting pluck value from user.

You can easily notice two conventions:

  1. Each dict field has a query which accepts a dict object:

    "messages": r.table("messages").filter(lambda message:
        message["conversation"] == conversation["id"]).coerce_to("array").map(lambda message:
            message.merge({})
    
  2. Each non dict field has an independent query:

    "from": r.table("accounts").get(message["from"])
    

So how can we use all these pieces information and build our nice dynamic merge queries?

Upvotes: 2

Views: 841

Answers (1)

elbear
elbear

Reputation: 769

My suggestion is to give up the dynamic part. Instead, you should design the API in a RESTful way. That means, for your example, that if someone wants to access an account, they can send a GET request to /accounts/[identifier]. If you they want all the messages that an account sent or received, they send a GET request to /accounts/[identifier]/messages.

Also, why do you need a conversation object? I would change your db structure as follows:

  1. accounts: "id", "created_on", "name", "email", "password" - Unchanged
  2. conversations: - Removed
  3. messages: "id", "created_on", "text", "sender", "receivers"
  4. message_readers: - Removed

Upvotes: 1

Related Questions