okrunner
okrunner

Reputation: 3193

Query optimization when using a JSON field

Running a PostgreSQL 9.6.4 on my laptop, I have a table called node which has a primary key id field and a properties::jsonb field.

I have a GIN index setup on the properties field.

When I run this query:

SELECT   n.*
FROM     node n
WHERE    node_type_id = '2'
AND      properties @> '{"slug":"wild-castles"}'::JSONB
ORDER BY n.id ASC OFFSET 0 LIMIT 10;

on ~5M rows table it takes about 20 seconds to get an answer. Looking into the explain plan I found out the query optimizer is first sorting the table by the primary key and then filtering by the properties field:

Limit  (cost=0.56..1517.94 rows=10 width=154)
  ->  Index Scan using node_pkey on node n  (cost=0.56..739571.11 rows=4874 width=154)
        Filter: ((properties @> '{"slug": "wild-castles"}'::jsonb) AND ((node_type_id)::text = '2'::text))

But when I remove the ordering I'm seeing the optimizer using the index as expected:

SELECT n.*
FROM   node n
WHERE  node_type_id = '2'
AND    properties @> '{"slug":"wild-castles"}'::JSONB
OFFSET 0 LIMIT 10;

Limit  (cost=93.77..127.10 rows=10 width=154)
  ->  Bitmap Heap Scan on node n  (cost=93.77..16338.56 rows=4874 width=154)
        Recheck Cond: (properties @> '{"slug": "wild-castles"}'::jsonb)
        Filter: ((node_type_id)::text = '2'::text)
        ->  Bitmap Index Scan on node_ix02  (cost=0.00..92.55 rows=4874 width=0)
              Index Cond: (properties @> '{"slug": "wild-castles"}'::jsonb)

Also, a simple WHERE properties @> '{"slug":"wild-castles"}'::JSONB behaves as expected:

EXPLAIN SELECT   n.*
FROM     node n
WHERE    properties @> '{"slug":"wild-castles"}'::JSONB
;

Bitmap Heap Scan on node n  (cost=93.77..16326.38 rows=4874 width=154)
  Recheck Cond: (properties @> '{"slug": "wild-castles"}'::jsonb)
  ->  Bitmap Index Scan on node_ix02  (cost=0.00..92.55 rows=4874 width=0)
        Index Cond: (properties @> '{"slug": "wild-castles"}'::jsonb)

So I guess I'm wondering why would the optimizer not use the index to filter out the rows first and then order them by the id field?

Upvotes: 11

Views: 505

Answers (2)

Tyler
Tyler

Reputation: 772

In my experience, you sometimes have to trick the query planner to get it to perform well, and it takes some tweaking and fiddling with...

I would try running this to see how it performs:

SELECT nn.* FROM (
    SELECT n.*
    FROM   node n
    WHERE  node_type_id = '2'
    AND    properties @> '{"slug":"wild-castles"}'::JSONB
) nn
ORDER BY nn.id ASC OFFSET 0 LIMIT 10;

Upvotes: 1

sandeep rawat
sandeep rawat

Reputation: 4957

Change Planner Method Configuration and force planer not to do seqscan

eg

      SET enable_seqscan = OFF;

       SELECT   n.*
        FROM     node n
               WHERE    node_type_id = '2'
               AND      properties @> '{"slug":"wild-castles"}'::JSONB
             ORDER BY n.id ASC OFFSET 0 LIMIT 10;

Upvotes: 1

Related Questions