shuba.ivan

Reputation: 4061

How to optimize query if I already use GIN index

I use PostgreSQL 11.8 and I am trying to provide full-text search over several columns. For that I created a GIN index over multiple columns concatenated together. After my database grew to 344,747 rows in the products table, the query became slow: the example below takes roughly 4.6 s. The EXPLAIN ANALYZE output shows my index is used (a Bitmap Index Scan on npdbcs_swedish_custom_index is present), but the query is still slow. If I read the plan correctly, most of the time is spent on grouping. Does anyone have an approach or suggestion for optimizing this query? I can't imagine how it will behave when my database grows to 10 million products. My biggest hope is the Planning Time: 0.790 ms reported by ANALYZE. Is that possible?

->  GroupAggregate  (cost=27625.67..47476.93 rows=10284 width=928) (actual time=32.328..4490.279 rows=6634 loops=1)

my index

create index npdbcs_swedish_custom_index on products
                    using GIN(to_tsvector('pg_catalog.swedish', name||price||description||brand||category||shop))

and my query

 EXPLAIN ANALYZE
        SELECT                         
            products_alias.id,
            products_alias.sku,
            products_alias.name AS "name",
            products_alias.description,
            products_alias.category,
            products_alias.price,
            products_alias.shipping,
            products_alias.currency,
            products_alias.instock,
            products_alias.product_url AS "productUrl",
            products_alias.image_url AS "imageUrl",
            products_alias.tracking_url AS "trackingUrl",
            products_alias.brand,
            products_alias.shop,
            products_alias.original_price AS "originalPrice",
            products_alias.ean,
            products_alias.manufacturer_article_number AS "manufacturerArticleNumber",
            products_alias.extras,
            products_alias.created_at AS "createdAt",
            products_alias.brand_relation_id AS "brandRelationId",
            products_alias.shop_relation_id AS "shopRelationId",
            array_agg(DISTINCT cpt.category_id) AS categoryIds,
            COUNT(DISTINCT uip.id) as "numberOfEntries",
            ts_rank_cd(to_tsvector('pg_catalog.swedish', products_alias.name||products_alias.price||products_alias.description||products_alias.brand||products_alias.category||products_alias.shop), to_tsquery('pg_catalog.swedish', 'AeroMoov&Air&Layer™&Group&2&Sittdyna&Mörkgrå&One&Size:*|329.00|AeroMoov|Liggdelar|Duovagnar|Barnvagnar|Solskydd')) AS rank

        FROM products products_alias 
        LEFT JOIN user_ip_product uip on uip.products_id = products_alias.id 
                LEFT JOIN product_category cpt on cpt.product_id = products_alias.id
                WHERE to_tsvector('pg_catalog.swedish', products_alias.name||products_alias.price||products_alias.description||products_alias.brand||products_alias.category||products_alias.shop) @@ to_tsquery('pg_catalog.swedish', 'AeroMoov&Air&Layer™&Group&2&Sittdyna&Mörkgrå&One&Size:*|329.00|AeroMoov|Liggdelar|Duovagnar|Barnvagnar|Solskydd')             
        AND products_alias.id != 810429                                         
        GROUP BY products_alias.id ORDER BY rank DESC                                          
        LIMIT 4
        OFFSET 0;



Limit  (cost=47631.19..47631.20 rows=4 width=928) (actual time=4401.654..4401.656 rows=4 loops=1)
  ->  Sort  (cost=47631.19..47656.90 rows=10284 width=928) (actual time=4401.652..4401.653 rows=4 loops=1)
        Sort Key: (ts_rank_cd(to_tsvector('swedish'::regconfig, ((((((products_alias.name)::text || (products_alias.price)::text) || products_alias.description) || (products_alias.brand)::text) || (products_alias.category)::text) || (products_alias.shop)::text)), '''aeromoov'' & ''air'' & ''layer'' & ''group'' & ''2'' & ''sittdyn'' & ''mörkgrå'' & ''one'' & ''siz'':* | ''329.00'' | ''aeromoov'' | ''liggdel'' | ''duovagn'' | ''barnvagn'' | ''solskyd'''::tsquery)) DESC
        Sort Method: top-N heapsort  Memory: 31kB
        ->  GroupAggregate  (cost=27625.67..47476.93 rows=10284 width=928) (actual time=31.780..4382.917 rows=6634 loops=1)
              Group Key: products_alias.id
              ->  Gather Merge  (cost=27625.67..44420.18 rows=20203 width=892) (actual time=31.096..88.389 rows=13451 loops=1)
                    Workers Planned: 2
                    Workers Launched: 2
                    ->  Nested Loop Left Join  (cost=26625.65..41088.23 rows=8418 width=892) (actual time=24.549..52.293 rows=4484 loops=3)
                          ->  Merge Left Join  (cost=26625.22..26646.92 rows=4285 width=888) (actual time=24.513..29.039 rows=2212 loops=3)
                                Merge Cond: (products_alias.id = uip.products_id)
                                ->  Sort  (cost=26622.18..26632.89 rows=4285 width=884) (actual time=24.430..26.065 rows=2211 loops=3)
                                      Sort Key: products_alias.id
                                      Sort Method: external merge  Disk: 2808kB
                                      Worker 0:  Sort Method: quicksort  Memory: 3261kB
                                      Worker 1:  Sort Method: quicksort  Memory: 2112kB
                                      ->  Parallel Bitmap Heap Scan on products products_alias  (cost=283.70..26363.68 rows=4285 width=884) (actual time=8.825..17.986 rows=2211 loops=3)
                                            Recheck Cond: (to_tsvector('swedish'::regconfig, ((((((name)::text || (price)::text) || description) || (brand)::text) || (category)::text) || (shop)::text)) @@ '''aeromoov'' & ''air'' & ''layer'' & ''group'' & ''2'' & ''sittdyn'' & ''mörkgrå'' & ''one'' & ''siz'':* | ''329.00'' | ''aeromoov'' | ''liggdel'' | ''duovagn'' | ''barnvagn'' | ''solskyd'''::tsquery)
                                            Filter: (id <> 810429)
                                            Rows Removed by Filter: 0
                                            Heap Blocks: exact=2563
                                            ->  Bitmap Index Scan on npdbcs_swedish_custom_index  (cost=0.00..281.13 rows=10284 width=0) (actual time=10.858..10.859 rows=6635 loops=1)
                                                  Index Cond: (to_tsvector('swedish'::regconfig, ((((((name)::text || (price)::text) || description) || (brand)::text) || (category)::text) || (shop)::text)) @@ '''aeromoov'' & ''air'' & ''layer'' & ''group'' & ''2'' & ''sittdyn'' & ''mörkgrå'' & ''one'' & ''siz'':* | ''329.00'' | ''aeromoov'' | ''liggdel'' | ''duovagn'' | ''barnvagn'' | ''solskyd'''::tsquery)
                                ->  Sort  (cost=3.05..3.18 rows=53 width=8) (actual time=0.070..0.086 rows=59 loops=3)
                                      Sort Key: uip.products_id
                                      Sort Method: quicksort  Memory: 27kB
                                      Worker 0:  Sort Method: quicksort  Memory: 27kB
                                      Worker 1:  Sort Method: quicksort  Memory: 27kB
                                      ->  Seq Scan on user_ip_product uip  (cost=0.00..1.53 rows=53 width=8) (actual time=0.030..0.040 rows=59 loops=3)
                          ->  Index Scan using idx_cdfc73564584665a on product_category cpt  (cost=0.42..3.34 rows=3 width=8) (actual time=0.007..0.009 rows=2 loops=6636)
                                Index Cond: (product_id = products_alias.id)
Planning Time: 0.660 ms
Execution Time: 4402.940 ms

It would be great if it took 0.660 ms. How can I optimize this query?

Upvotes: 0

Views: 300

Answers (2)

jjanes

Reputation: 44147

would be great if it spent 0.660 ms.

The planning time is how much time the planner spent working out the execution plan. It is not how long the planner thought executing that plan would or should take. There is no reason to think your hope here is achievable; those two numbers represent entirely different things. It is like comparing the time you spend tying your shoes with the time it takes to run a marathon. Sure, it would be nice to run a marathon in the time it takes to tie your shoes, but there is no reason to think that is achievable.

I'm guessing most of the time is going either to computing ts_rank_cd, or to gathering the (possibly TOASTed) inputs necessary to compute it. What happens if you remove the ts_rank_cd and choose something else to ORDER BY? EXPLAIN VERBOSE on a simplified version of your query does suggest to me that the actual evaluation of ts_rank_cd is occurring in the correct plan node (i.e. the slow one) for this explanation to be plausible.
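One way to run that experiment is to keep the query shape the same but swap the expensive rank expression for a cheap sort key. A sketch (the tsquery is abbreviated here; substitute the full one from the question):

```sql
EXPLAIN ANALYZE
SELECT p.id,
       array_agg(DISTINCT cpt.category_id) AS category_ids,
       COUNT(DISTINCT uip.id)              AS number_of_entries
FROM products p
LEFT JOIN user_ip_product uip ON uip.products_id = p.id
LEFT JOIN product_category cpt ON cpt.product_id = p.id
WHERE to_tsvector('pg_catalog.swedish',
          p.name||p.price||p.description||p.brand||p.category||p.shop)
      @@ to_tsquery('pg_catalog.swedish', 'aeromoov & air')  -- abbreviated
  AND p.id != 810429
GROUP BY p.id
ORDER BY p.id DESC   -- cheap sort key instead of ts_rank_cd
LIMIT 4;
```

If this version drops from ~4.4 s to something far smaller, that points the finger at ts_rank_cd and at fetching its (possibly TOASTed) inputs.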

One thing that might make it faster is to save the computed tsvector into the table as a real column, rather than computing it dynamically. The extra storage needed is, of course, the trade-off there.
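PostgreSQL 11 has no generated columns (those arrived in 12), so on your version a stored tsvector would be kept up to date with a trigger. A minimal sketch; the column, function, and trigger names are made up, and I also added coalesce and space separators, which the original expression lacks (without separators adjacent values get glued into one lexeme, and a single NULL column makes the whole tsvector NULL):

```sql
ALTER TABLE products ADD COLUMN search_tsv tsvector;

-- Recompute the stored tsvector on every insert/update.
CREATE FUNCTION products_search_tsv_refresh() RETURNS trigger AS $$
BEGIN
    NEW.search_tsv := to_tsvector('pg_catalog.swedish',
        coalesce(NEW.name, '')        || ' ' ||
        coalesce(NEW.price::text, '') || ' ' ||
        coalesce(NEW.description, '') || ' ' ||
        coalesce(NEW.brand, '')       || ' ' ||
        coalesce(NEW.category, '')    || ' ' ||
        coalesce(NEW.shop, ''));
    RETURN NEW;
END
$$ LANGUAGE plpgsql;

CREATE TRIGGER products_search_tsv_trg
    BEFORE INSERT OR UPDATE ON products
    FOR EACH ROW EXECUTE FUNCTION products_search_tsv_refresh();

-- Backfill existing rows (any UPDATE fires the trigger), then index the column.
UPDATE products SET name = name;
CREATE INDEX products_search_tsv_idx ON products USING GIN (search_tsv);
```

The query would then use `search_tsv @@ to_tsquery(...)` in the WHERE clause and `ts_rank_cd(search_tsv, ...)` for the rank, avoiding the per-row to_tsvector computation entirely.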

I can't imagine how it will work when my db grows to 10 million products

As your products table grows, is your WHERE condition going to still return about the same number of rows, or will it return about the same proportion of the table?

Upvotes: 0

Laurenz Albe

Reputation: 246403

Try increasing work_mem in the hope that you can get a more efficient hash aggregate.
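The plan also shows one worker spilling a sort to disk (external merge, 2808kB), so a session-level experiment is cheap to try; the value below is a guess, pick one your server can spare per sort operation:

```sql
SET work_mem = '64MB';  -- session-local; does not affect other connections
-- ... re-run the EXPLAIN ANALYZE from the question here ...
RESET work_mem;
```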

I admit that I find it surprising that the time is spent in the group aggregate ...

Upvotes: 1
