Moiz
Moiz

Reputation: 127

Select start/end date for each group

I have an employee job assignment table in the form shown below:

emp_id, dept_id, assignment,  start_dt,    end_dt
1,      10,      project 1,   2001-01-01,  2001-12-31
1,      10,      project 2,   2002-01-01,  2002-12-31
1,      20,      project 3,   2003-01-01,  2003-12-31
1,      20,      project 4,   2004-01-01,  2004-12-31
1,      10,      project 5,   2005-01-01,  2005-12-31

From the above table i need to summarize employee department history i.e the duration an employee worked for a particular department before being transferred to some other department.

Expected output result is shown below:

emp_id, dept_id,  start_dt,    end_dt
1,      10,       2001-01-01,  2002-12-31
1,      20,       2003-01-01,  2004-12-31
1,      10,       2005-01-01,  2005-12-31

I have tried to solve the above problem using the oracle analytics functions but was unable to get the desired output

    select distinct emp_id, dept_id, start_dt, end_dt 
    from ( 
       select emp_id, dept_id, 
              min(start_date) 
                 over (partition by emp_id, dept_id order by emp_id, dept_id 
                 RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as start_dt,
              max(end_date)   
                 over (partition by emp_id, dept_id order by emp_id, dept_id 
                 RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as end_dt
       from employee_job_assignment
    )
    where emp_id = 1;

The above query results in the following output:

emp_id, dept_id,  start_dt,    end_dt
1,      10,       2001-01-01,  2005-12-31
1,      20,       2003-01-01,  2004-12-31

Upvotes: 1

Views: 700

Answers (3)

Gordon Linoff
Gordon Linoff

Reputation: 1270653

This is a gaps and islands problem. But with a twist. In this case, you probably also want to take into account gaps within the same department. For instanct:

emp_id, dept_id, assignment,  start_dt,    end_dt
1,      10,      project 1,   2001-01-01,  2001-12-31
1,      10,      project 2,   2003-01-01,  2003-12-31

This should return two rows rather than one.

For this, determine where each island begins by comparing the previous end date to the current start date. That defines the beginning of a grouping. The rest is aggregation:

select emp_id, dept_id, min(start_dt), max(end_dt)
from (select eja.*,
             sum(case when prev_end_dt = start_dt - 1
                      then 0 else 1
                 end) over (partition by emp_id, dept_id) as grouping
      from (select eja.*,
                   lag(end_dt) over (partition by emp_id, dept_id order by start_dt) as prev_end_dt
            from employee_job_assignment eja
           ) eja
     ) eja
group by emp_id, dept_id, grouping;

Upvotes: 1

The Impaler
The Impaler

Reputation: 48850

The key to the solution is to separate the rows into groups according to your logic. You can do that using the LAG() function. For example:

select
  max(emp_id) as emp_id,
  max(dept_id) as dept_id,
  min(start_dt) as start_dt,
  max(end_dt) as end_dt
from (
  select
    *,
    sum(inc) over(partition by emp_id order by start_dt) as grp
  from (
    select
      *,
      case when lag(dept_id) over(partition by emp_id order by start_dt) 
                <> dept_id then 1 else 0 end as inc
    from employee_job_assignment
  ) x
) y
group by grp
order by grp

Upvotes: 1

Fahmi
Fahmi

Reputation: 37483

You can try below -

select emp_id,dept_id,min(start_Date) as start_Date,min(end_date) as end_date
from
(
select *,
row_number() over(order by start_date)-row_number() over(partition by dept_id order by start_date) as grp
from t
)A group by grp, dept_id,emp_id

OUTPUT:

emp_id  dept_id start_Date              end_date
 1       10      01/01/2001 00:00:00    31/12/2001 00:00:00
 1       10      01/01/2005 00:00:00    31/12/2005 00:00:00
 1       20      01/01/2003 00:00:00    31/12/2003 00:00:00

Upvotes: 1

Related Questions