ResonateAlpha

Reputation: 53

Storing large JSON data in Postgres is infeasible, so what are the alternatives?

I have large JSON data (greater than 2 kB) in each record of my table, currently stored in a JSONB field. My tech stack is Django and Postgres. I don't perform any updates/modifications on this JSON data, but I do need to read it frequently and fast. However, because the JSON data is larger than 2 kB, Postgres splits it into chunks and moves it into the TOAST table, and the read process has become very slow. So what are the alternatives? Should I use another database like MongoDB to store these large JSON fields?

Note: I don't want to pull the keys out of this JSON and turn them into columns. This data comes from an API.
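For reference, the kind of setup described might look like this in Django (the model and field names here are illustrative, not from the question):

    # models.py -- illustrative only; names are assumptions
    from django.db import models

    class ApiRecord(models.Model):
        # Large (>2 kB) API response stored as JSONB. Postgres
        # compresses values this size and moves them out of the main
        # table into its TOAST relation.
        payload = models.JSONField()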

Upvotes: 5

Views: 15724

Answers (2)

Nico Toub

Reputation: 1558

If you just need to store this JSON object and read it back in full, without using its structure in your WHERE clauses, what about simply storing the data as binary in a bytea column? https://www.postgresql.org/docs/current/datatype-binary.html
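A minimal Django sketch of that idea, assuming a hypothetical ApiRecord model: serialize the JSON to bytes on write and decode it on read, so Postgres never parses the document.

    import json

    from django.db import models

    class ApiRecord(models.Model):
        # Raw JSON bytes in a bytea column; Postgres stores and
        # returns them without parsing or validating the JSON.
        payload_raw = models.BinaryField()

        def set_payload(self, data: dict) -> None:
            self.payload_raw = json.dumps(data).encode("utf-8")

        def get_payload(self) -> dict:
            # bytes(...) also handles the memoryview psycopg2 returns.
            return json.loads(bytes(self.payload_raw).decode("utf-8"))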

Upvotes: 1

Adam Charnock

Reputation: 1626

It is hard to answer specifically without knowing the details of your situation, but here are some things you may try:

  1. Use Postgres 12 (stored) generated columns to maintain the fields or smaller JSON blobs that are commonly needed. This adds storage overhead, but frees you from having to maintain this duplication yourself (items 1-3 are sketched after this list).
  2. Create indexes for any JSON fields you are querying (Postgresql allows you to create indexes for JSON expressions).
  3. Use a composite index, where the first field in the index is the field you are querying on, and the second field (/json expression) is the value you wish to retrieve. In this case Postgresql should be able to answer the query from the index alone.
  4. Similar to 1, create a materialised view which extracts the fields you need and allows you to query them quickly. You can add indexes to the materialised view too (sketched below). This may be a good fit: the usual drawback of materialised views is that they can be slow to refresh, but in your case the data doesn't update anyway.
  5. Investigate why the TOAST tables are slow (a quick size check is sketched below). I'm not sure what performance you are seeing, but if you really do need to pull back a lot of data then you are going to need fast data access whatever database you choose to go with.
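To make items 1-3 concrete, here is a sketch as a Django migration. The app, table, and JSON key names (myapp, myapp_apirecord, customer_id, status, total) are assumptions for illustration, not details from the question; the SQL is what matters.

    # migrations/0002_json_perf.py -- illustrative; all names assumed.
    from django.db import migrations

    class Migration(migrations.Migration):
        dependencies = [("myapp", "0001_initial")]

        operations = [
            # 1. A Postgres 12+ stored generated column that keeps a
            #    commonly read key materialised outside the JSONB blob.
            migrations.RunSQL(
                sql="""
                    ALTER TABLE myapp_apirecord
                    ADD COLUMN customer_id text
                    GENERATED ALWAYS AS (payload ->> 'customer_id') STORED;
                """,
                reverse_sql="ALTER TABLE myapp_apirecord DROP COLUMN customer_id;",
            ),
            # 2. An index on a JSON expression, usable by queries that
            #    filter on payload ->> 'status'.
            migrations.RunSQL(
                sql="""
                    CREATE INDEX apirecord_status_idx
                    ON myapp_apirecord ((payload ->> 'status'));
                """,
                reverse_sql="DROP INDEX apirecord_status_idx;",
            ),
            # 3. A composite index whose second column is the value you
            #    want back, so Postgres can satisfy the query with an
            #    index-only scan.
            migrations.RunSQL(
                sql="""
                    CREATE INDEX apirecord_status_total_idx
                    ON myapp_apirecord (
                        (payload ->> 'status'),
                        (payload ->> 'total')
                    );
                """,
                reverse_sql="DROP INDEX apirecord_status_total_idx;",
            ),
        ]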
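Item 4 could look like the following, again with all names assumed. The view extracts the hot fields once and can be indexed like a normal table:

    from django.db import migrations

    class Migration(migrations.Migration):
        dependencies = [("myapp", "0002_json_perf")]

        operations = [
            migrations.RunSQL(
                sql="""
                    CREATE MATERIALIZED VIEW apirecord_summary AS
                    SELECT id,
                           payload ->> 'customer_id' AS customer_id,
                           payload ->> 'status'      AS status
                    FROM myapp_apirecord;

                    CREATE INDEX apirecord_summary_customer_idx
                    ON apirecord_summary (customer_id);
                """,
                reverse_sql="DROP MATERIALIZED VIEW apirecord_summary;",
            ),
        ]

After loading new API data you would run REFRESH MATERIALIZED VIEW apirecord_summary; since the rows are never modified, that refresh only needs to happen on insert.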
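And for item 5, a quick way to see how much of the table actually lives in its TOAST relation (table name assumed; this expects the table to have a TOAST table, which any table holding >2 kB JSONB values will):

    from django.db import connection

    def toast_size(table: str = "myapp_apirecord") -> tuple:
        """Return (main_size, toast_size) as human-readable strings."""
        with connection.cursor() as cur:
            cur.execute(
                """
                SELECT pg_size_pretty(pg_relation_size(c.oid)),
                       pg_size_pretty(pg_relation_size(c.reltoastrelid))
                FROM pg_class c
                WHERE c.relname = %s;
                """,
                [table],
            )
            return cur.fetchone()

Comparing the two numbers tells you whether most of your reads are hitting TOAST; running EXPLAIN (ANALYZE, BUFFERS) on a typical query is the natural next step.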

Your mileage may vary with all of the above suggestions, especially as each will depend on your particular use case. (see the questions in my comment)

However, the overall idea is to use the tools that Postgresql provides to make your data quickly accessible. Yes, this may involve pulling the data out of its original JSON blob, but this doesn't need to be done manually; Postgresql provides some great tools for this.

Upvotes: 6
