Using aggregate function in a CTE query

Question

I have this query

   WITH RECURSIVE territory_data(id,manager_id, name, geography) AS (
      SELECT r.id, r.manager_id, r.name, ST_Union(bd.data)
      FROM hierarchy h
        JOIN reps r ON r.hierarchy_id = h.id
        JOIN rep_sales_area sa ON r.id = sa.rep_id
        JOIN boundary_data bd ON sa.boundary_id = bd.id
      WHERE h.id = 1
      GROUP BY r.id

      UNION ALL 

      SELECT r.id, r.manager_id, r.name, ST_Union(t.geography) as geography
      FROM territory_data t, reps r
      WHERE t.manager_id = r.id 
      GROUP BY r.id 
    )

    select * from territory_data;

Essentially what I am trying to do is a sales rep controls different territories, they have a manager who controls all the territories of their subordinates. So essentially go up to the top of the chain and the executive would control the entire map. A sales rep can only have 1 manager, but a manager can have many subordinates. I am sure this is possible to do in a query without looping in code but I am new to postgres (coming from mysql) so still trying to figure out all these more advanced CTE functions.. window functions etc.

Any pointers would be greatly appreciated, thanks.

UPDATE

Say I have these tables and data

create table reps (id int NOT NULL primary key, hierarchy_id int NOT NULL,  name varchar(64) NOT NULL, manager_id int );
INSERT INTO reps (id, hierarchy_id, name, manager_id) VALUES 
(1,1, 'John Doe', 6),
(2,1, 'Jane Doe', 7), 
(3, 1, 'Mark Doe', 8), 
(4,1, 'Jake Doe',8),
(5, 1, 'Henry Doe', 6),

(6, 2, 'Derek Smith', 10),
(7, 2, 'Sam Smith', 9),
(8, 2, 'Debby Smith', 9),


(9, 3, 'Carl Burk', null),
(10, 3, 'Adrian Burk', null);


create table hierarchy (id int not null primary key, name varchar(32) NOT NULL);
insert into hierarchy(id, name) values 
(1, 'Sales Reps'),
(2, 'Managers'),
(3, 'Executives');

create table rep_sales_area(id int PRIMARY KEY NOT NULL, rep_id int NOT NULL, boundary_id INT NOT NULL); 
insert into rep_sales_area(id, rep_id, boundary_id) VALUES
(1, 1, 10 ),
(2, 1, 11 ),
(3, 1, 12 ),
(4, 2, 13 ),
(5, 2, 14 ),
(6, 3, 15 ),
(7, 4, 16 ),
(8, 4, 17 ),
(9, 4, 18 ),
(10, 5, 19);

This is the output I would like. Note - first layer doesnt matter (id 1-5) if it is easier to not include it

id | Name        | Manager     | boundaries
1  | John Doe    | Derek Smith | 10, 11, 12
2  | Jane Doe    | Sam Smith   | 13, 14
3  | Mark Doe    | Debby Smith | 15 
4  | Jake Doe    | Debby Smith | 16, 17, 18
5  | Henry Doe   | Derek Smith | 19
6  | Derek Smith | Adrian Buck | 10, 11, 12, 19
7  | Sam Smith   | Carl Buck   | 13, 14
8  | Debby Smith | Carl Buck   | 15, 16, 17, 18
9  | Carl Buck   |             | 13, 14, 15, 16, 17, 18
10 | Adrian Buck |             | 10, 11, 12, 19

Obviously I want the actual geography data but thought I would simplify it with just the ids for now.

Mike Jones · Accepted Answer

Edit: this version addresses your concern about the first version not supporting any level of hierarchy. It uses two recursive queries, where the second one feeds off the first. I'd never used a query with two levels of recursion, and am quite happy that it works (at least on DB2 and Postgres).

2nd Edit: I was finally able to get onto SQLFIDDLE.COM, without it being too busy, and refactored this for Postgres 9.3.1 instead of the prior DB2 version. Changes were fairly minor to make it happy. SQLFIDDLE link

Given this is fairly complicated and minimally tested with a small set of data, I suggest throwing a much larger set of data at it before putting this in production.

with recursive

associations as (

  select     rep_id as owner_id, boundary_id, id as rep_sales_area_id
  from       rep_sales_area

  union all

  select     s.manager_id as owner_id, a.boundary_id, a.rep_sales_area_id 
  from       associations a
  inner join reps s
        on   s.id = a.owner_id 
  where      s.manager_id is not null
) ,

territories as (

  select     r.id as owner_id
            ,'' as boundary_id_list
            ,0 as rep_sales_area_id
            ,1 as row_num
  from       reps r

  union all 

  select     t.owner_id
            ,trim( t.boundary_id_list ) || case when t.boundary_id_list = '' then '' else ', ' end || n.boundary_id
            ,n.rep_sales_area_id
            ,t.row_num + 1
  from       territories t
  cross join lateral (
    select   a.boundary_id, a.rep_sales_area_id
    from     associations a 
    where    a.rep_sales_area_id > t.rep_sales_area_id
         and a.owner_id = t.owner_id
    order by a.rep_sales_area_id
    limit    1
  ) as n 
)

select     id, name, manager, boundaries
from ( 
  select owner_id as id, r.name, m.name as manager, boundary_id_list as boundaries
        ,row_number() over( partition by owner_id 
                            order by     row_num desc ) as ordered_row_num 
  from       territories t
  left join  reps r
        on   r.id = t.owner_id
  left join  reps m
        on   m.id = r.manager_id
  ) as t

where      ordered_row_num = 1
order by   id

Using aggregate function in a CTE query

Answers (1)

Related Questions