Tarlen
Tarlen

Reputation: 3797

Index on JSON field with dynamic keys

I'm on PG 9.5 and I have a table Visitors(id, data::json)

Example:

Visitor(id: 1, data: {name: 'Jack', age: 33, is_user: true })

I'd like to perform queries like

The keys inside the data column user-specified and as such are dynamic.

Which index makes the most sense in this situation?

Upvotes: 6

Views: 4864

Answers (3)

radtek
radtek

Reputation: 36360

I believe the best approach here is to create a raw sql migration:

  1. Run ./manage.py makemigrations --empty yourApp where yourApp is the app of the model you want to change indexes for.

  2. Edit the migration i.e.

operations = [
    migrations.RunSQL("CREATE INDEX idx_content_taxonomies_categories ON common_content((taxonomies->>'categories'));")
]

Where idx_content_taxonomies_categories is the name of the index, common_content is your table, taxonomies is your JSONField, and categories in this case is the key you want to index.

That should do it. Cheers!

Upvotes: 0

Leonard
Leonard

Reputation: 561

You can look at additional extension JsQuery – is a language to query jsonb data type, it provides an additional functionality to jsonb (currently missing in PostgreSQL), such as a simple and effective way to search in nested objects and arrays, more comparison operators with indexes support. Read more here: https://github.com/postgrespro/jsquery.

In your cases, you can create jsonb_path_value_ops index:

CREATE INDEX idx_visitors ON visitors USING GIN (jsonb jsonb_path_value_ops);

and use the next queries:

select * from visitors where jsonb @@ 'name = "Jack" and age > 25';
select * from visitors where jsonb @@ 'not name = * and is_user=true';

Upvotes: 3

Avish
Avish

Reputation: 4626

You can use a GIN index on a jsonb column, which gives you generalized, dynamic indexing of keys and values inside JSON value.

CREATE TABLE visitors (
  id integer,
  data jsonb
);

CREATE INDEX idx_visitors_data ON cards USING GIN (data);

SELECT * FROM visitors
WHERE data -> 'is_user' AND NOT data ? 'name';

Unfortunately, GIN indexes don't support numeric range comparisons. So while you could still issue a query for visitors named Jack aged over 25:

SELECT * FROM visitors
WHERE data @> '{"name": "Jack"}' AND ((data ->> 'age')::integer) > 25;

This will only use the index to find the name "Jack", and possibly to find rows which have an "age" key, but the actual test that the ages are over 25 will be done as a scan over the matching rows.

Note that if you really need range comparisons, you can still add non-GIN indexes on specific paths inside the JSON value, if you expect them to appear often enough to make that worthwhile. For example, you could add an index on data -> 'age' that supports range comparisons:

CREATE INDEX idx_visitors_data_age ON visitors ( ((data ->> 'age')::integer) );

(note the extra parentheses; you'll get an error without them).

See this excellent blog post for further information.

Upvotes: 9

Related Questions