Nuno Gomes
Nuno Gomes

Reputation: 97

Query optimization for postgresql

I have to resolve a problem in my class about query optimization in postgresql.

I have to optimize the following query.

"The query determines the yearly loss in revenue if orders just with a quantity of more than the average quantity of all orders in the system would be taken and shipped to customers."

select  sum(ol_amount) / 2.0 as avg_yearly
from    orderline, (select   i_id, avg(ol_quantity) as a
            from     item, orderline
            where    i_data like '%b'
                 and ol_i_id = i_id
            group by i_id) t
where   ol_i_id = t.i_id
    and ol_quantity < t.a

Is it possible through indices or something else to optimize that query (Materialized view is possible as well)?

Execution plan can be found here. Thanks.

Upvotes: 1

Views: 167

Answers (2)

MPękalski
MPękalski

Reputation: 7103

Remember that making indexes may not speed up the query when you have to retreive something like 30% of the table. In this case bitmap index might help you but as far as I remember it is not available in Postgres. So, think which table to index, maybe it would be worth to index the big table by ol_i_id as the join you are making only needs to match less than 10% of the big table and small table is loaded to ram (I might be mistaken here, but at least in SAS hash join means that you load the smaller table to ram).

You may try aggregating data before doing any joins and reuse the groupped data. I assume that you need to do everything in one query without explicitly creating any staging tables by hand. Also recently, I have been working a lot on SQL Server so I may mix the syntax, but give it a try. There are many assumptions I have made about the data and the structure of the table, but hopefully it will work.

;WITH GrOrderline (
  SELECT ol_i_id, ol_quantity, SUM(ol_amount) AS Yearly, Count(*) AS cnt
  FROM orderline 
  GROUP BY ol_i_id, ol_quantity
),
WITH AvgOrderline (
  SELECT 
    o.ol_i_id, SUM(o.ol_quantity)/SUM(cnt) AS AvgQ
  FROM GrOrderline AS o 
  INNER JOIN item AS i ON (o.ol_i_id = i.i_id AND RIGHT(i.i_data, 1) = 'b')
  GROUP BY o.ol_i_id
)
  SELECT SUM(Yearly)/2.0 AS avg_yearly
  FROM GrOrderline o INNER JOIN AvgOrderline a ON (a.ol_i_id = a.ol_i_id AND o.ol_quantity < a.AvG)

Upvotes: 0

Joe Love
Joe Love

Reputation: 5932

first if you have to do searches from the back of data, simply create an index on the reverse of the data

create index on item(reverse(i_data);

Then query it like so:

select  sum(ol_amount) / 2.0 as avg_yearly
from    orderline, (select   i_id, avg(ol_quantity) as a
            from     item, orderline
            where    reverse(i_data) like 'b%'
                 and ol_i_id = i_id
            group by i_id) t
where   ol_i_id = t.i_id
    and ol_quantity < t.a

Upvotes: 1

Related Questions