pelelter

Reputation: 675

PostgreSQL - What should I put inside a JSON column

The data I want to store has these characteristics:

Here's how it would look in a regular table:

Table

I'm having trouble figuring out which would be the better way to store this data in a database for this situation.

Below are the ideas I already had:

So my questions are:

  1. Is one of these solutions (or one I have not thought of) better for this case?
  2. Are there other factors, other than the ones presented here, that I should consider to make this decision?

Upvotes: 2

Views: 371

Answers (3)

Laurenz Albe

Reputation: 246063

Unless you have very many columns (~ 100), it is usually better to use normal columns. NULL values don't take any storage space in PostgreSQL.

On the other hand, if you have queries that can use any of these columns in the WHERE condition with an = comparison, a single GIN index on a jsonb column might be better than many B-tree indexes, because maintaining many B-tree indexes would cost more.
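As a sketch of this approach (table and column names here are hypothetical, not from the question):

```sql
-- Sparse attributes collected in a single jsonb column
CREATE TABLE item (
    id    bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    attrs jsonb
);

-- One GIN index supports containment queries on any key
CREATE INDEX item_attrs_gin ON item USING gin (attrs);

-- Equality on any attribute can be written with the @> containment
-- operator, which this GIN index can serve
SELECT * FROM item WHERE attrs @> '{"color": "red"}';
```

With plain columns, each column that appears in a WHERE clause would typically need its own B-tree index to get comparable lookups.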

The definitive answer depends on the SQL statements that you plan to run on that table.

Upvotes: 2

Stephan Stieger

Reputation: 1

It depends on how much data you want to store, but as long as the set of columns is finite, it shouldn't make a big difference whether it contains a lot of NULLs or not.

Upvotes: 0

Gordon Linoff

Reputation: 1269443

You have laid out the three options pretty well. Things to consider are:

  • Performance
  • Data size
  • Ease of maintenance
  • Flexibility
  • Security

Note that you don't even allude to security considerations. But security at the table level is usually a tad simpler than at the column level and might be important for regulated data such as PII (personally identifiable information).

The primary strength of the JSON solution is flexibility. It is easy to add new columns. But you don't need that. JSON has a cost in data size and data type flexibility (notably JSON doesn't support date/times explicitly).
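The date/time limitation shows up in practice as casts on every read. A minimal sketch, assuming the hypothetical `item` table with a jsonb `attrs` column:

```sql
-- JSON has no native timestamp type, so a date/time value is stored
-- as a string and must be cast back to timestamptz on every read
SELECT (attrs ->> 'created_at')::timestamptz AS created_at
FROM item
WHERE attrs ? 'created_at';  -- only rows where the key exists
```

A regular `timestamptz` column would avoid both the cast and the risk of malformed string values.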

A multiple-table solution requires duplicating the primary key but may result in much less storage overall if the columns really are sparse. Whether it does may also depend on the data type: a NULL string, for instance, can occupy less space than a NULL float in a table record.

The joins on multiple tables will be 1-1 on primary keys. These should be pretty fast.

What would I do? Unless the answer is obvious, I would dump the data into a single table with a bunch of columns. If that table starts to get unwieldy, then I would think about splitting it into separate tables -- but still have one table for the common columns. The details of one or multiple tables can be hidden behind a view.
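The split-plus-view layout could look like this (all names hypothetical, as a sketch of the idea):

```sql
-- Common columns stay in one table
CREATE TABLE product (
    id   bigint PRIMARY KEY,
    name text NOT NULL
);

-- Sparse columns split into a side table sharing the primary key
CREATE TABLE product_details (
    id     bigint PRIMARY KEY REFERENCES product (id),
    weight numeric,
    color  text
);

-- A view hides the split; the join is 1:1 on the primary key
CREATE VIEW product_full AS
SELECT p.id, p.name, d.weight, d.color
FROM product p
LEFT JOIN product_details d USING (id);
```

Queries go through `product_full`, so the decision to split (or later merge back) stays invisible to applications.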

Upvotes: 1

Related Questions