Reputation: 3
I'm trying to get a count of 'visit' records created by joining values from three tables. I have a simple query below, but it takes almost 30 minutes to run on the database. Is there a way to optimize this query any further?
select a."ClientID" as ClientID, b."ProviderID" as ProviderID, count(1) as VisitCount
from "Log" c
inner join "MessageDetail" b on c."MessageDetailID" = b."MessageDetailID"
inner join "Message" a on a."MessageID" = b."MessageID"
where a."CreatedUTCDate" >= NOW() - INTERVAL '1 HOUR'
group by a."ClientID", b."ProviderID"
Example Result
ClientID                              ProviderID  VisitCount
3245cf64-test-4d05-9d5d-345653566455  677777      1
3245cf64-test-4d05-9d5d-345653566455  677777      1
0284a326-test-4757-b00e-34563465dfgg  9999        5
Explain plan
GroupAggregate  (cost=6529150.62..6529160.28 rows=483 width=48)
  Group Key: a."ClientID", b."ProviderID"
  ->  Sort  (cost=6529150.62..6529151.83 rows=483 width=40)
        Sort Key: a."ClientID", b."ProviderID"
        ->  Nested Loop  (cost=1.00..6529129.09 rows=483 width=40)
              ->  Nested Loop  (cost=0.56..6509867.54 rows=3924 width=48)
                    ->  Seq Scan on "Message" a  (cost=0.00..6274917.96 rows=3089 width=44)
                          Filter: ("CreatedUTCDate" >= (now() - '01:00:00'::interval))
                    ->  Index Scan using "ix_MessageDetail_MessageId" on "MessageDetail" b  (cost=0.56..75.40 rows=66 width=20)
                          Index Cond: ("MessageID" = a."MessageID")
              ->  Index Only Scan using "ix_Log_MessageDetailId" on "Log" c  (cost=0.43..4.90 rows=1 width=8)
                    Index Cond: ("MessageDetailID" = b."MessageDetailID")
Explain analyze plan
GroupAggregate  (cost=6529127.35..6529137.01 rows=483 width=48) (actual time=791639.382..791661.555 rows=118 loops=1)
  Group Key: a."ClientID", b."ProviderID"
  ->  Sort  (cost=6529127.35..6529128.56 rows=483 width=40) (actual time=791639.373..791649.235 rows=64412 loops=1)
        Sort Key: a."ClientID", b."ProviderID"
        Sort Method: external merge  Disk: 3400kB
        ->  Nested Loop  (cost=1.00..6529105.82 rows=483 width=40) (actual time=25178.920..791410.769 rows=64412 loops=1)
              ->  Nested Loop  (cost=0.56..6509844.55 rows=3924 width=48) (actual time=25178.874..790954.577 rows=65760 loops=1)
                    ->  Seq Scan on "Message" a  (cost=0.00..6274894.96 rows=3089 width=44) (actual time=25178.799..790477.178 rows=25121 loops=1)
                          Filter: ("CreatedUTCDate" >= (now() - '01:00:00'::interval))
                          Rows Removed by Filter: 30839080
                    ->  Index Scan using "ix_MessageDetail_MessageId" on "MessageDetail" b  (cost=0.56..75.40 rows=66 width=20) (actual time=0.009..0.016 rows=3 loops=25121)
                          Index Cond: ("MessageID" = a."MessageID")
              ->  Index Only Scan using "ix_Log_MessageDetailId" on "Log" c  (cost=0.43..4.90 rows=1 width=8) (actual time=0.005..0.006 rows=1 loops=65760)
                    Index Cond: ("MessageDetailID" = b."MessageDetailID")
                    Heap Fetches: 65590
Planning time: 38.501 ms
Execution time: 791662.728 ms
Upvotes: 0
Views: 64
Reputation: 246063
This part of the execution plan

->  Seq Scan on "Message" a (...) (actual time=25178.799..790477.178 rows=25121 loops=1)
      Filter: ("CreatedUTCDate" >= (now() - '01:00:00'::interval))
      Rows Removed by Filter: 30839080

proves that an index on "CreatedUTCDate" would speed up this query quite a lot: almost the complete execution time is spent in that sequential scan, and it reads over 30 million rows to find about 25,000, so the filter condition is highly selective.
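A minimal sketch of that index, assuming a plain B-tree index on "Message" is what you need (the index name here is made up):

-- B-tree index on the filter column, so the planner can replace the
-- sequential scan on "Message" with an index scan.
-- CONCURRENTLY builds the index without blocking concurrent writes,
-- but it cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY "ix_Message_CreatedUTCDate"
    ON "Message" ("CreatedUTCDate");

After building it, run ANALYZE "Message"; to refresh the statistics, then re-check the plan with EXPLAIN (ANALYZE) to confirm the sequential scan is gone.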
Upvotes: 2