Przemysław P.
Przemysław P.

Reputation: 81

Improving performance of a select query

I've got this materialized view which aggregates data from multiple tables that store information about statistics data in my application. For now this view contains about 800.000 records. The problem is that this query runs pretty slow (about 1.5 second) and doesn't meet the demands of a customer. Is there some way to improve its performance? I'm using PostgreSQL 9.6.

I've tried to create indexes. But it doesn't help.

CREATE INDEX t1 ON statistic_basic_view (active, visible, removed, draft, id, name, object_type, is_paid, company_name);

CREATE INDEX t2 ON statistic_basic_view (company_name, is_paid, object_type, name, id, draft, removed, visible, active);

CREATE INDEX t3 ON statistic_basic_view (company_name, is_paid, object_type, name, id);

CREATE INDEX t4 ON statistic_basic_view (draft, removed, visible, active);

CREATE INDEX t5 ON statistic_basic_view (active, visible, removed, draft);

CREATE INDEX t6 ON statistic_basic_view (id, name, object_type, is_paid, company_name);

CREATE INDEX t8 ON statistic_basic_view (active, visible, removed, draft, id, name, object_type, is_paid, company_name);

CREATE INDEX t9 ON statistic_basic_view ((active AND visible AND (NOT removed) AND (NOT draft)));

CREATE INDEX t10 ON statistic_basic_view (((NOT draft) AND (NOT removed) AND active = true AND visible = true));

Query:

SELECT id,
       name,
       object_type,
       is_paid,
       company_name,
       SUM(CASE
             WHEN type = 'COMPARE'
               AND service_type IN ('GG_WEB') THEN 1
             ELSE 0
         END) AS compare_count,
       SUM(CASE
             WHEN type = 'EXPORT'
               AND service_type IN ('GG_WEB') THEN 1
             ELSE 0
         END) AS export_count,
       SUM(CASE
             WHEN type = 'VIEW'
               AND service_type IN ('GG_WEB') THEN 1
             ELSE 0
         END) AS view_count,
       SUM(CASE
             WHEN type = 'REMEMBER'
               AND service_type IN ('GG_WEB') THEN 1
             ELSE 0
         END) AS remember_count,
       SUM(CASE
             WHEN type = 'SEARCH'
               AND service_type IN ('GG_WEB') THEN 1
             ELSE 0
         END) AS search_count,
       SUM(CASE
             WHEN type = 'MAIL'
               AND service_type IN ('GG_WEB') THEN 1
             ELSE 0
         END) AS mail_count
FROM statistic_basic_view
WHERE active = TRUE
  AND visible = TRUE
  AND removed = FALSE
  AND draft = FALSE
GROUP BY id,
         name,
         object_type,
         is_paid,
         company_name
ORDER BY view_count DESC,
         id ASC
limit 15;

Explain analyse:

Limit  (cost=74204.47..74204.50 rows=15 width=130) (actual time=1420.542..1420.545 rows=15 loops=1)
  ->  Sort  (cost=74204.47..74600.55 rows=158432 width=130) (actual time=1420.540..1420.542 rows=15 loops=1)
        Sort Key: (sum(CASE WHEN ((type = 'VIEW'::text) AND ((service_type)::text = 'GG_WEB'::text)) THEN 1 ELSE 0 END)) DESC, id
        Sort Method: top-N heapsort  Memory: 28kB
        ->  HashAggregate  (cost=68733.10..70317.43 rows=158432 width=130) (actual time=1420.539..1420.542 rows=8988 loops=1)
              Group Key: id, name, object_type, is_paid, company_name
              ->  Seq Scan on statistic_basic_view  (cost=0.00..24950.65 rows=761434 width=94) (actual time=0.023..249.851 rows=762118 loops=1)
                    Filter: (active AND visible AND (NOT removed) AND (NOT draft))
                    Rows Removed by Filter: 30047
Planning time: 0.665 ms
Execution time: 1420.545 ms

Upvotes: 0

Views: 53

Answers (1)

Laurenz Albe
Laurenz Albe

Reputation: 246513

No index can help you with this query.

None of the WHERE conditions is selective, you can't speed up a GROUP BY with that many groups using an index, and you cannot use an index for the ordering (because there is a grouping by different criteria before that).

What you should do is create yet another materialized view on top of your materialized view (or directly on top of the base tables) that has the result pre-calculated and is regularly refreshed. This gives you slightly stale data, but fast.

Upvotes: 1

Related Questions