Unfocused
Unfocused

Reputation: 13

Most optimal way to store nested information in a database

I want to store some nested information in a Postgres database and I am wondering what is the most optimal way to do so.

I have a list of cars for rent, structured like this:

[Brand] > [Model] > [Individual cars for rent of that brand and model], ex.:

[
{
    "id": 1,
    "name": "Audi",
    "models": [
        {
            "id": 1,
            "name": "A1",
            "cars": [
                {
                    "id": 1,
                    "license": "RPY9973",
                    "mileage": "41053"
                },
                {
                    "id": 2,
                    "license": "RPY3001",
                    "mileage": "102302"
                },
                {
                    "id": 3,
                    "license": "RPY9852",
                    "mileage": "10236"
                }
            ]
        },
        {
            "id": 2,
            "name": "A3",
            "cars": [
                {
                    "id": 1,
                    "license": "RPY1013",
                    "mileage": "66952"
                },
                {
                    "id": 2,
                    "license": "RPY3284",
                    "mileage": "215213"
                },
                {
                    "id": 3,
                    "license": "RPY0126",
                    "mileage": "19632"
                }
            ]
        }
        ...
    ]
}
...

]

Currently, having limited experience with databases and storing arrays, I am storing it in a 'brands' table with the following columns:

id (integer) - brand ID
name (text) - brand name
models (text) - contains stringified content of models and cars within them, which are parsed upon reading

In practice, this does the job, however I would like to know what the most efficient way would be.

For example, should I split the single table into three tables: 'brands', 'models' and 'cars' and have the tables reference each other (brands.models would be an array of unique model IDs, which I could use to read data from the 'models' table, and models.cars would be an array of unique car IDs, which I could use to read data from the 'cars' table)?

Upvotes: 1

Views: 3943

Answers (2)

David Shindler
David Shindler

Reputation: 109

The best solution to store the nested data in your postgres database is json or jsonb field. The benefits using json or jsonb are:

  • significantly faster to process, supports indexing (which can be a significant advantage),
  • simpler schema designs (replacing entity-attribute-value (EAV) tables with jsonb columns, which can be queried, indexed and joined, allowing for performance improvements up until 1000X)

Upvotes: 0

aarbor
aarbor

Reputation: 1534

Rather than store it as json, jsonb, or as arrays, the most efficient way to store the data would be to store it as relational data (excluding the data types for brevity):

create table brands(
 id,
 name, 
 /* other columns */
PRIMARY KEY (id)
);

create table models(
 id,
 name,
 brand_id REFERENCES brands(id),
/* other columns */
PRIMARY KEY (id)
);

create table cars(
 id,
 model_id REFERENCES models(id),
 mileage, 
 license,
/* other columns */
PRIMARY KEY (id)
);

You can then fetch and update each entity individually, without having to parse json. Partial updates is also much easier when you only have to focus on a single row, rather than worrying about updating arrays or json. For querying, you would join by the primary keys. For example, to get rental cars available by a brand:

select b.id, b.name, m.id, m.name, c.id, c.mileage, c.license
 FROM brands b
   LEFT JOIN models m 
   ON m.brand_id = b.id
   LEFT JOIN cars c 
   ON c.model_id = m.id
where b.id = ?

Based on querying / filtering patterns, you would then also want to create indexes on commonly used columns...

CREATE INDEX idx_car_model ON cars(model_id);
CREATE INDEX idx_model_brand ON models(brand_id);

Upvotes: 1

Related Questions