Greg Smith
Greg Smith

Reputation: 43

postgres hierarchy - count of child levels and sort by date of children or grandchildren

I would like to know how to write a postgres subquery so that the following table example will output what I need.

id   parent_id    postdate

1      -1      2015-03-10 
2     1      2015-03-11  (child level 1)
3     1      2015-03-12  (child level 1)
4     3      2015-03-13  (child level 2)
5    -1      2015-03-14
6    -1      2015-03-15
7     6      2015-03-16  (child level 1)

If I want to sort all the root ids by child level 1 with a count of children(s) from the parent, the output would be something like this

id    count      date
6   2     2015-03-15
1   4     2015-03-10
5   1     2015-03-14

The output is sorted by postdate based on the root's child. The 'date' being outputted is the date of the root's postdate. Even though id#5 has a more recent postdate, the rootid#6's child (id#7) has the most recent postdate because it is being sorted by child's postdate. id#5 doesnt have any children so it just gets placed at the end, sorted by date. The 'count' is the number children(child level 1), grandchildren(child level 2) and itself (root). For instance, id #2,#3,#4 all belong to id#1 so for id#1, the count would be 4.

My current subquery thus far:

SELECT p1.id,count(p1.id),p1.postdate
 FROM mytable p1
     LEFT JOIN mytable c1 ON c1.parent_id = p1.id AND p1.parent_id = -1
     LEFT JOIN mytable c2 ON c2.parent_id = c1.id AND p1.parent_id = -1
GROUP BY p1.id,c1.postdate,p1.postdate
ORDER by c1.postdate DESC,p1.postdate DESC

Upvotes: 4

Views: 2122

Answers (4)

verhie
verhie

Reputation: 1318

select
 p.id,
 (1+c.n) as parent_post_plus_number_of_subposts,
 p.postdate
from 
    table as p
inner join

    (
    select
    parent_id, count(*) as n, max(postdate) as _postdate
    from table
    group by parent_id
    ) as c

on p.id = c.parent_id
where p.parent_id = -1
order by c._postdate desc

Upvotes: 0

joop
joop

Reputation: 4503

You'll need a recursive query to count the elements in the subtrees:

WITH RECURSIVE opa AS (
        SELECT id AS par
        , id AS moi
        FROM the_tree
        WHERE parent_id IS NULL
        UNION ALL
        SELECT o.par AS par
        , t.id AS moi
        FROM opa o
        JOIN the_tree t ON t.parent_id = o.moi
        )
SELECT t.id
        , c.cnt
        , t.postdate
FROM the_tree t
JOIN ( SELECT par, COUNT(*) AS cnt
        FROM opa o
        GROUP BY par
        ) c ON c.par = t.id
ORDER BY t.id
        ;

UPDATE (it appears the OP also wants the maxdate per tree)

        -- The same, but also select the postdate
        -- --------------------------------------
WITH RECURSIVE opa AS (
        SELECT id AS par
        , id AS moi
        , postdate AS postdate
        FROM the_tree
        WHERE parent_id IS NULL
        UNION ALL
        SELECT o.par AS par
        , t.id AS moi
        -- , GREATEST(o.postdate,t.postdate) AS postdate
        , t.postdate AS postdate
        FROM opa o
        JOIN the_tree t ON t.parent_id = o.moi
        )
SELECT t.id
        , c.cnt
        , t.postdate
        , c.maxdate
FROM the_tree t
JOIN ( SELECT par, COUNT(*) AS cnt
        , MAX(o.postdate) AS maxdate -- and obtain the max()
        FROM opa o
        GROUP BY par
        ) c ON c.par = t.id
ORDER BY c.maxdate, t.id
        ;

Upvotes: 1

Greg Smith
Greg Smith

Reputation: 43

After looking at everyone's code, I created the subquery I needed. I can use PHP to vary the 'case when' code depending on the user's sort selection. For instance, the code below will sort the root nodes based on child level 1's postdate.

with recursive cte as (
select id as parent, id as root, null::timestamp as child_postdate,0 as depth
from mytable
where parent_id = -1
union all
select r.parent, mytable.id, mytable.postdate,depth+1
from cte r
join mytable
  on parent_id = r.root
)
select m.id, c.cnt, m.postdate
from ssf.dtb_021 m
join ( select parent, count(*) as cnt, max(child_postdate) as max_child_date,depth
       from cte 
      group by parent,depth
     ) c on c.parent = m.id
order by 
  case
    when depth=2 then 1
    when depth=1 then 2
    else 0
  end DESC,
c.max_child_date desc nulls last, m.postdate desc;

Upvotes: 0

maletin
maletin

Reputation: 585

create table mytable ( id serial primary key, parent_id int references mytable, postdate date );
create index mytable_parent_id_idx on mytable (parent_id);
insert into mytable (id, parent_id, postdate) values (1, null, '2015-03-10');
insert into mytable (id, parent_id, postdate) values (2, 1, '2015-03-11');
insert into mytable (id, parent_id, postdate) values (3, 1, '2015-03-12');
insert into mytable (id, parent_id, postdate) values (4, 3, '2015-03-13');
insert into mytable (id, parent_id, postdate) values (5, null, '2015-03-14');
insert into mytable (id, parent_id, postdate) values (6, null, '2015-03-15');
insert into mytable (id, parent_id, postdate) values (7, 6, '2015-03-16');

with recursive recu as (
  select id as parent, id as root, null::date as child_postdate
    from mytable
   where parent_id is null
   union all
  select r.parent, mytable.id, mytable.postdate
    from recu r
    join mytable
      on parent_id = r.root
)
select m.id, c.cnt, m.postdate, c.max_child_date
  from mytable m
  join ( select parent, count(*) as cnt, max(child_postdate) as max_child_date
           from recu
          group by parent
       ) c on c.parent = m.id
 order by c.max_child_date desc nulls last, m.postdate desc;

Upvotes: 1

Related Questions