Superdooperhero

Reputation: 8096

Oracle SQL: how to group by, but have multiple rows if a group is repeated at a later date

I have the following query:

select paaf.assignment_id,
       paaf.position_id,
       paaf.effective_start_date effective_start_date,
       paaf.effective_end_date   effective_end_date
from   per_all_assignments_f paaf
where  paaf.position_id is not null
and    paaf.assignment_type in ('E', 'C')
and    paaf.primary_flag = 'Y'
and    paaf.assignment_number like '209384%'
order  by 3

Which returns:

"assignment_id" "position_id"   "effective_start_date"  "effective_end_date"
6518    5323    01/01/2013  28/02/2014
6518    8133    01/03/2014  30/06/2014
6518    8133    01/07/2014  31/10/2015
6518    239570  01/11/2015  15/11/2015
6518    239570  16/11/2015  31/12/2015
6518    8133    01/01/2016  27/07/2016
6518    8133    28/07/2016  31/12/4712

I grouped this using:

select paaf.assignment_id,
       paaf.position_id,
       min(paaf.effective_start_date) effective_start_date,
       max(paaf.effective_end_date)   effective_end_date
from   per_all_assignments_f paaf
where  paaf.position_id is not null
and    paaf.assignment_type in ('E', 'C')
and    paaf.primary_flag = 'Y'
and    paaf.assignment_number like '209384%'
group  by paaf.assignment_id, paaf.position_id

Which returns:

"assignment_id" "position_id"   "effective_start_date"  "effective_end_date"
6518    5323    01/01/2013  28/02/2014
6518    8133    01/03/2014  31/12/4712
6518    239570  01/11/2015  31/12/2015

But I need a query that returns:

"assignment_id" "position_id"   "effective_start_date"  "effective_end_date"
6518    5323    01/01/2013  28/02/2014
6518    8133    01/03/2014  31/10/2015
6518    239570  01/11/2015  31/12/2015
6518    8133    01/01/2016  31/12/4712

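That is, position_id 8133 must produce two rows, because it covers two separate periods (01/03/2014 to 31/10/2015 and 01/01/2016 onwards) with position 239570 in between. Each chronologically contiguous run of the same position_id should collapse into its own row.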

Is there some way of accomplishing this using the date order?

The answer turned out to be:

with paaf as (
   select paaf.assignment_id,
          paaf.position_id,
          paaf.effective_start_date effective_start_date,
          paaf.effective_end_date   effective_end_date
   from   per_all_assignments_f paaf
   where  paaf.position_id is not null
   and    paaf.assignment_type in ('E', 'C')
   and    paaf.primary_flag = 'Y'
-- and    paaf.assignment_number like '209384%'
)
select paaf2.assignment_id,
       paaf2.position_id,
       min(paaf2.effective_start_date) as effective_start_date,
       max(paaf2.effective_end_date)   as effective_end_date
from   (
          select paaf.*,
                 row_number() over (order by paaf.assignment_id, paaf.effective_start_date) as seqnum,
                 row_number() over (partition by paaf.assignment_id, paaf.position_id order by paaf.assignment_id, paaf.effective_start_date) as seqnum_p
          from   paaf
       )  paaf2
group  by (paaf2.seqnum - paaf2.seqnum_p), paaf2.assignment_id, paaf2.position_id

Upvotes: 0

Views: 87

Answers (1)

Gordon Linoff

Reputation: 1269483

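This is a gaps-and-islands problem. There are different approaches, but a simple one uses the difference of two row numbers: one counting rows in date order, and one counting rows in date order within each position_id. Within a run of consecutive rows for the same position_id the difference is constant, and it changes once another position intervenes, so it can be used as part of the grouping key: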

with paaf as (<your first query here>
     )
select paaf.assignment_id,
       paaf.position_id,
       min(paaf.effective_start_date) as effective_start_date,
       max(paaf.effective_end_date) as effective_end_date
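from (select paaf.*,
             -- partition by assignment_id so that each assignment is numbered
             -- independently when the query returns more than one assignment
             row_number() over (partition by assignment_id order by effective_start_date) as seqnum,
             row_number() over (partition by assignment_id, position_id order by effective_start_date) as seqnum_p
      from paaf
     ) paaf
group by (seqnum - seqnum_p), position_id, assignment_id;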
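
To see why the difference of row numbers isolates each run, here is a minimal Python sketch that replays the same logic on the sample rows from the question. It is only an illustration, not part of the SQL solution: the function name `collapse_islands` is hypothetical, the dates are rewritten in ISO format so that plain string comparison sorts them chronologically, and the numbering is kept separate per assignment_id.

```python
from collections import defaultdict

# Sample rows from the question:
# (assignment_id, position_id, effective_start_date, effective_end_date)
rows = [
    (6518, 5323,   "2013-01-01", "2014-02-28"),
    (6518, 8133,   "2014-03-01", "2014-06-30"),
    (6518, 8133,   "2014-07-01", "2015-10-31"),
    (6518, 239570, "2015-11-01", "2015-11-15"),
    (6518, 239570, "2015-11-16", "2015-12-31"),
    (6518, 8133,   "2016-01-01", "2016-07-27"),
    (6518, 8133,   "2016-07-28", "4712-12-31"),
]


def collapse_islands(rows):
    """Group consecutive rows with the same position_id (hypothetical helper)."""
    seqnum = defaultdict(int)    # row number per assignment_id
    seqnum_p = defaultdict(int)  # row number per (assignment_id, position_id)
    islands = {}

    # Walk the rows in date order within each assignment.
    for assignment_id, position_id, start, end in sorted(
        rows, key=lambda r: (r[0], r[2])
    ):
        seqnum[assignment_id] += 1
        seqnum_p[(assignment_id, position_id)] += 1
        diff = seqnum[assignment_id] - seqnum_p[(assignment_id, position_id)]

        # Same grouping key as the SQL "group by" clause.
        key = (assignment_id, position_id, diff)
        if key in islands:
            first, last = islands[key]
            islands[key] = (min(first, start), max(last, end))
        else:
            islands[key] = (start, end)

    result = [(a, p, s, e) for (a, p, _), (s, e) in islands.items()]
    return sorted(result, key=lambda r: (r[0], r[2]))


for row in collapse_islands(rows):
    print(row)
```

On this data the differences are 0 for position 5323, 1 for the first run of 8133, and 3 for both 239570 and the second run of 8133. Two different positions can therefore share the same difference, which is why position_id has to stay in the grouping key alongside it.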

Upvotes: 2
