Josh K

Reputation: 28883

Postgres Query Tuning

I have a table that holds historical records. Whenever a count gets updated, a record is added specifying that a new value was fetched at that time. The table schema looks like this:

    Column     |           Type           |                             Modifiers
---------------+--------------------------+--------------------------------------------------------------------
 id            | integer                  | not null default nextval('project_accountrecord_id_seq'::regclass)
 user_id       | integer                  | not null
 created       | timestamp with time zone | not null
 service       | character varying(200)   | not null
 metric        | character varying(200)   | not null
 value         | integer                  | not null

Now I'd like to get the total number of records updated each day, for the last seven days. Here's what I came up with:

SELECT
    created::timestamp::date as created_date,
    count(created)
FROM
    project_accountrecord
GROUP BY
    created::timestamp::date
ORDER BY
    created_date DESC
LIMIT 7;

This runs slowly (11406.347ms). EXPLAIN ANALYZE gives:

Limit  (cost=440939.66..440939.70 rows=7 width=8) (actual time=24184.547..24370.715 rows=7 loops=1)
   ->  GroupAggregate  (cost=440939.66..477990.56 rows=6711746 width=8) (actual time=24184.544..24370.699 rows=7 loops=1)
         ->  Sort  (cost=440939.66..444340.97 rows=6802607 width=8) (actual time=24161.120..24276.205 rows=92413 loops=1)
               Sort Key: (((created)::timestamp without time zone)::date)
               Sort Method: external merge  Disk: 146328kB
               ->  Seq Scan on project_accountrecord  (cost=0.00..153671.43 rows=6802607 width=8) (actual time=0.017..10132.970 rows=6802607 loops=1)
 Total runtime: 24420.988 ms

There are a little over 6.8 million rows in this table. What can I do to increase performance of this query? Ideally I'd like it to run in under a second so I can cache it and update it in the background a couple of times a day.

Upvotes: 1

Views: 199

Answers (1)

Tomasz Myrta

Reputation: 1144

As written, your query must scan the whole table, compute the count for every day, and only then limit the result to the 7 most recent days. You can speed it up by scanning only the last 7 days (or more, if records aren't added every day):

where created > now()::date - '7 days'::interval
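
For reference, here is a sketch of the full query with that filter in place. The filter has to be applied to the `created` column itself, because the `created_date` alias from the SELECT list is not visible in WHERE. The index is my own addition, not something from the question: its name is hypothetical, and without some index on `created` the planner will most likely still fall back to a sequential scan.

```sql
-- Assumption: no index on "created" exists yet. The index name is hypothetical.
-- On a busy table, CREATE INDEX CONCURRENTLY avoids blocking writes while it builds.
CREATE INDEX project_accountrecord_created_idx
    ON project_accountrecord (created);

-- Same aggregation as in the question, restricted to recent rows only.
SELECT
    created::timestamp::date AS created_date,
    count(*) AS record_count
FROM
    project_accountrecord
WHERE
    -- filter on the raw column; the created_date alias cannot be used here
    created > now()::date - '7 days'::interval
GROUP BY
    created::timestamp::date
ORDER BY
    created_date DESC
LIMIT 7;
```

Running EXPLAIN ANALYZE on this version should show whether the index is actually used; if days can pass without new records, widen the interval so that LIMIT 7 still has seven days to return.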

Another approach is to cache historical results in an extra table and count only the current day.
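
A minimal sketch of that caching idea, assuming a summary table called `project_accountrecord_daily` (a hypothetical name, as are its columns) that is refreshed once a day by some scheduled job. Treat it as an illustration rather than a finished design.

```sql
-- Hypothetical summary table: one row per completed day.
CREATE TABLE project_accountrecord_daily (
    created_date date PRIMARY KEY,
    record_count bigint NOT NULL
);

-- Refresh step, run once a day: aggregate every completed day that is
-- newer than the latest cached day. The first run backfills all history.
INSERT INTO project_accountrecord_daily (created_date, record_count)
SELECT
    created::timestamp::date,
    count(*)
FROM
    project_accountrecord
WHERE
    created >= coalesce(
        (SELECT max(created_date) + 1 FROM project_accountrecord_daily),
        '-infinity'::date
    )
    AND created < now()::date   -- exclude today, which is still changing
GROUP BY
    created::timestamp::date;

-- Read step: cached days plus a live count for today only.
SELECT created_date, record_count
FROM (
    SELECT created_date, record_count
    FROM project_accountrecord_daily
    UNION ALL
    -- an aggregate without GROUP BY always yields one row, even if the count is 0
    SELECT now()::date, count(*)
    FROM project_accountrecord
    WHERE created >= now()::date
) AS daily
ORDER BY created_date DESC
LIMIT 7;
```

With an index on `created`, the live part only touches today's rows, so the read step stays cheap no matter how large the history grows. Days that have no records at all get no row in the summary table, so they are skipped rather than reported as zero.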

Upvotes: 2
