zane

Reputation: 1

improve performance for postgres

I count the number of users this way; it runs for 5 seconds before producing a result, and I am looking for a better solution:

SELECT COUNT(*)
FROM (SELECT user_id
      FROM slot_result_primary
      WHERE session_timestamp BETWEEN 1590598800000 AND 1590685199999
      GROUP BY user_id) AS foo

Upvotes: 0

Views: 127

Answers (2)

Dominik Boszko

Reputation: 186

Counting is a very heavy operation in Postgres and should be avoided if possible. It is hard to make it much faster, because for each row Postgres needs to go to the disk. You can create a better index so that the rows to count are located faster, but even then the count time will grow linearly with the size of the data.

Your index should be:

CREATE INDEX session_timestamp_user_id_index ON slot_result_primary (session_timestamp, user_id)

for best results.
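To confirm that the planner actually picks up the index, you can run EXPLAIN ANALYZE on the original query; ideally you would see an index-only scan on the new index (the exact plan and timings depend on your data):

-- Check whether the time-range scan uses the new index
EXPLAIN ANALYZE
SELECT COUNT(*)
FROM (SELECT user_id
      FROM slot_result_primary
      WHERE session_timestamp BETWEEN 1590598800000 AND 1590685199999
      GROUP BY user_id) AS foo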

Still, an index will not fully solve your count problem. In a similar situation I faced two days ago (a SELECT query running in 3 s and the count in 1 s), dedicated indexes let me push the SELECT time down to 0.3 ms, but the best I could do with the count was 700 ms.

Here you can find a good article summarizing why count is difficult and the different ways to make it faster: https://www.citusdata.com/blog/2016/10/12/count-performance/
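One of the approaches discussed there is settling for an estimate and reading the planner's row count instead of scanning every row. A rough sketch of that idea, based on the well-known helper from the PostgreSQL wiki (not part of the original answer, and only an approximation):

-- Returns the planner's row estimate for an arbitrary query
CREATE FUNCTION count_estimate(query text) RETURNS integer AS $$
DECLARE
  rec  record;
  rows integer;
BEGIN
  FOR rec IN EXECUTE 'EXPLAIN ' || query LOOP
    rows := substring(rec."QUERY PLAN" FROM ' rows=([[:digit:]]+)');
    EXIT WHEN rows IS NOT NULL;
  END LOOP;
  RETURN rows;
END;
$$ LANGUAGE plpgsql VOLATILE STRICT;

Note that this estimates the number of matching rows, not distinct user_ids, so it only helps where an approximate figure is acceptable.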

Upvotes: 0

Milney

Reputation: 6417

First of all, you can simplify the query:

SELECT COUNT(DISTINCT user_id)
FROM slot_result_primary
WHERE session_timestamp BETWEEN 1590598800000 AND 1590685199999

Most importantly, make sure you have an index on session_timestamp.
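If it does not exist yet, something like this should do (the index name here is just illustrative):

CREATE INDEX IF NOT EXISTS slot_result_primary_session_timestamp_idx
ON slot_result_primary (session_timestamp)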

Upvotes: 2
