Haya
Haya

Reputation: 21

Enhance Postgresql query (many subqueries)

I have a postgrsql query which takes a lot of time to execute (5 min) because of sub queries I think. I would like to find a way to enhance this query:

select v.id, v.pos, v.time,  v.status,  vi.name,vi.type,
        (select c.fullname
           from company c
          where vi.registered_owner_code = c.owcode ) AS registered_owner
       ,(select c.fullname
           from company c
         where vi.group_beneficial_owner_code=c.owcode) AS group_beneficial_owner
       ,(select c.fullname
           from company c
          where vi.operator_code = c.owcode ) AS operator
       ,(select c.fullname
           from company c
          where vi.manager_code = c.owcode ) AS manager
  from (car_pos v left join cars vi on v.id = vi.id)
 where age(now(), v.time::time with time zone) < '1 days'::interval

Upvotes: 2

Views: 137

Answers (5)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656616

3 key ingredients:

  1. Do away with the correlated subqueries, use JOIN instead - like other answers mentioned already.

  2. In the WHERE clause, don't use an expression on your column, which cannot utilize an index. @Frank already mentions it. Only the most basic, stable expressions can be rewritten by the query planner to use an index. See how I rewrote it.

  3. Create suitable indexes.

SELECT v.id, v.pos, v.time, v.status, c.name, c.type
      ,r.fullname AS registered_owner
      ,g.fullname AS group_beneficial_owner
      ,o.fullname AS operator
      ,m.fullname AS manager
FROM   car_pos v
LEFT   JOIN cars    c ON USING (id)
LEFT   JOIN company r ON r.owcode = c.registered_owner_code
LEFT   JOIN company g ON g.owcode = c.group_beneficial_owner_code
LEFT   JOIN company o ON o.owcode = c.operator_code
LEFT   JOIN company m ON m.owcode = c.manager_code
WHERE  v.time > (now() - interval '1 day');

You need unique indexes on cars.id and company.owcode (primary keys do the job, too).

And you need an index on car_pos.time like:

CREATE INDEX car_pos_time_idx ON car_pos (time DESC);

Works without descending order, too. If you have lots of rows (-> big table, big index), you might want to create a partial index that covers only recent history and recreate it on a daily or weekly basis at off hours:

CREATE INDEX car_pos_time_idx ON car_pos (time DESC);
WHERE time > $mydate

Where $mydate is the result of (now() - interval '1 day'). This matches the condition of your query logically at any time. Effectiveness slowly deteriorates over time.

Aside: don't name a column of type timestamp "time", that's misleading from a documentation point of view. Actually, rather don't use time as column name at all. It's a reserved word in every SQL standard and a type name in PostgreSQL.

Upvotes: 1

ppeterka
ppeterka

Reputation: 20726

One trivial solution would be to convert it to joins

select v.id, v.pos, v.time,  v.status,  vi.name,vi.type, 
reg_owner.fullname AS registered_owner, 
gr_ben_owner.fullname AS group_beneficial_owner, 
op.fullname AS operator, 
man.fullname AS manager
from  
  car_pos v 
  left join cars vi on v.id = vi.id
  left join company reg_owner on vi.registered_owner_code = reg_owner.owcode
  left join company gr_ben_owner on vi.group_beneficial_owner_code = gr_ben_owner.owcode
  left join company op on vi.operator_code = op.owcode
  left join company man on vi.manager_code  = man.owcode
where age(now(), v.time::time with time zone) < '1 days'::interval

I suspect however, that it might be possible by doing only one join of the table Company... I'm not 100% sure about the exact syntax to, and I have doubts that this will enhance performance (because of all the CASE-WHEN, GROUP by, etc) compared to the four time join solution, but I think this should work too. (I assumed, that cars-car_pos is a one-to-one relationship)

select v.id, MAX(v.pos) as pos, MAX(v.time) as vtime,  MAX(v.status) as status,  MAX(vi.name) as name,MAX(vi.type) as type, 
MAX(CASE WHEN c.owcode = vi.registered_owner_code THEN c.fullname END) AS registered_owner, 
MAX(CASE WHEN c.owcode = vi.group_beneficial_owner_code THEN c.fullname END) AS group_beneficial_owner, 
MAX(CASE WHEN c.owcode = vi.operator_code THEN op.fullname END) AS operator, 
MAX(CASE WHEN c.owcode = vi.manager_code THEN man.fullname END) AS manager
from  
  car_pos v 
  left join cars vi on v.id = vi.id
  left join company c on c.owcode IN (vi.registered_owner_code, vi.group_beneficial_owner_code, vi.operator_code, vi.manager_code)
group by v.id
having age(now(), vtime::time with time zone) < '1 days'::interval

If you could put the table creation DDL scripts, and some inserts into the question, it would be easy to try in SQL fiddle...

Upvotes: 0

Vikram Jain
Vikram Jain

Reputation: 5588

select v.id, v.pos, v.time,  v.status,  vi.name,vi.type,
c1.fullname as Registered_owner,
c2.fullname as group_beneficial_owner,          
c3.fullname AS operator,   
c4.fullname AS manager 

from car_pos v 
left outer join cars vi on v.id = vi.id
left outerjoin company c1 on vi.registered_owner_code=c1.owcode
left outerjoin company c2 on vi.group_beneficial_owner_code=c2.owcode
left outerjoin company c3 on vi.operator_code=c3.owcode
left outerjoin company c4 on vi.manager_code=c4.owcode
 where age(now(), v.time::time with time zone) < '1 days'::interval

Upvotes: 0

Marek
Marek

Reputation: 1878

because of subqueries I think

This is not really a guessing game. You can get query execution plan explanation in pgadmin or just under console

http://www.pgadmin.org/docs/1.4/query.html

http://www.postgresql.org/docs/current/static/sql-explain.html

then you can see what's going on and what takes that much time.

After analysis you can add indexes or change something else but at least you will know what needs to be changed.

Upvotes: 2

Frank Heikens
Frank Heikens

Reputation: 127066

The WHERE condition can't use an index, you have to change that one. v.time should not be in a volatile function, age() in this case.

Upvotes: 1

Related Questions