Reputation: 703
I have a window function that uses the rank() operation to match the most recent DHCP log event to an IP address to associate IP to host names. The problem is the query does not scale very well to large datasets and as such I want to try to rewrite it in terms of group by but I have not been successful.
create table large_table as
select column1, column2, column3, column4, column5, column6
from
(
select
a.column1, a.column2, a.start_time,
rank() OVER(
PARTITION BY a.column2, a.column1 order by a.start_time DESC
) as rank,
last_value( a.column3) OVER (
PARTITION BY a.column2, a.column1 order by a.start_time ASC
RANGE BETWEEN unbounded preceding and unbounded following
) as column3,
a.column4, a.column5, a.column6
from
(table2 s
INNER JOIN table3 t
ON s.column2=t.column2 and s.event_time > t.start_time
) a
) b
where rank =1;
Question 1: How do we rewrite the above query using group by instead of window functions?
Upvotes: 0
Views: 309
Reputation: 26464
I don't think it is going to be a winning strategy to do windows written as group-by in this sort of way. This would lead to significant performance headaches when you join aggregates from a table back against the main table.
A better approach is to use plpgsql and a function which adds the window information inside a loop. For example, rownumber could be added like this:
CREATE OR REPLACE FUNCTION foo_with_rownumber () RETURNS SETOF foo_with_rownumber
LANGUAGE PLPGSQL AS $$
DECLARE out_val foo_with_rownumber;
iter int;
BEGIN
iter := 1;
FOR out_val IN select f.*, 0 FROM foo order by bar
LOOP
out_val.rownumber = iter;
return next out_val;
iter := iter + 1;
END LOOP;
END;
$$;
rank() would be only slightly more complex to add so this should get you going.
Upvotes: 1