Bill Roberts
Bill Roberts

Reputation: 17

How to derive years of service for employees that have termed and returned several times

I'm working with Oracle and have sample data to show you regarding my question (sample data is below).

I'm trying to figure out how to derive years of service for employees that have terminated and returned several times (up to 6 times). The business rules are that if someone leaves and returns before 365 days, then we bridge their service -- otherwise we don't. Bridging is determined by the "Yes" and "No" values in the BRIDGE fields.

My sample data below: The HIRE, TERM and BRIDGE fields go up to 3 levels but it can at times go all the way up to 6 levels deep for some employees.

╔════════╦════════════╦════════════╦══════════╦════════════╦═══════════╦══════════╦═══════════╗
║ EMPLID ║   HIRE_1   ║   TERM_1   ║ BRIDGE_1 ║   HIRE_2   ║  TERM_2   ║ BRIDGE_2 ║  HIRE_3   ║
╠════════╬════════════╬════════════╬══════════╬════════════╬═══════════╬══════════╬═══════════╣
║   1001 ║ 9/22/2011  ║ 12/9/2011  ║ Yes      ║ 7/1/2012   ║ 7/2/2013  ║ Yes      ║ 5/3/2014  ║
║   1002 ║ 3/29/1999  ║ 6/30/1999  ║ Yes      ║ 1/24/2000  ║ 3/29/2004 ║ No       ║ 11/5/2007 ║
║   1003 ║ 8/16/2009  ║ 12/31/2009 ║ Yes      ║ 1/1/2010   ║ 5/15/2012 ║ Yes      ║ 9/5/2012  ║
║   1004 ║ 7/23/2007  ║ 8/21/2009  ║ Yes      ║ 3/1/2010   ║ 2/9/2011  ║ Yes      ║ 4/25/2011 ║
║   1005 ║ 5/11/2010  ║ 8/31/2010  ║ Yes      ║ 7/1/2011   ║ 5/5/2012  ║ Yes      ║ 9/6/2012  ║
║   1006 ║ 1/17/2009  ║ 5/8/2009   ║ Yes      ║ 12/8/2009  ║ 5/4/2010  ║ Yes      ║ 1/18/2011 ║
║   1007 ║ 9/20/2004  ║ 5/4/2007   ║ No       ║ 11/15/2010 ║ 1/21/2011 ║ Yes      ║ 2/16/2011 ║
║   1008 ║ 6/22/2009  ║ 8/29/2009  ║ No       ║ 5/9/2011   ║ 4/13/2012 ║ Yes      ║ 5/21/2012 ║
║   1009 ║ 10/13/2008 ║ 5/9/2009   ║ Yes      ║ 8/24/2009  ║ 5/25/2010 ║ Yes      ║ 8/10/2010 ║
║   1010 ║ 8/13/2008  ║ 5/7/2010   ║ Yes      ║ 1/4/2011   ║ 5/31/2011 ║ Yes      ║ 8/15/2011 ║
║   5011 ║ 7/3/1985   ║ 10/24/2000 ║ No       ║ 7/19/2010  ║           ║ No       ║           ║
╚════════╩════════════╩════════════╩══════════╩════════════╩═══════════╩══════════╩═══════════╝

For the last entry, 5011, the employee was rehired on 7/19/2010 and hasn't left. So the calculation should be a little less than 5 years (4.xx). Because the prior termination date between the next hire date was more than 365 days, bridge_1 is "No". So this resets years recognized for service. Otherwise if the employee was rehired less than 365 days from the prior term, then the bridge_1 would be yes.

I'm looking for the most recent period of bridged employment, so if there are two periods separated by a gap of more than a year (as for employee 5011) then I only want the most recent, not the total of all the separate contiguous (bridged) periods.

Upvotes: 0

Views: 400

Answers (2)

Alex Poole
Alex Poole

Reputation: 191275

If you were working with the underlying data rather than the pivoted view you could use analytics and a gaps-and-isands approach. It's overkill here as @PonderStibbons pointed out1 but you could unpivot what you have back to something close to its original form:

with cte1 as (
  select t.emplid, t.num, t.hire, nvl(t.tmp_term, sysdate) as term,
    bridge as prev_bridge,
    nvl(lead(bridge) over (partition by emplid order by num), 'No') as next_bridge
  from (select t42.*, null as term_3, 'No' as bridge_0 from t42)
  unpivot ((hire, tmp_term, bridge) for num in (
    (hire_1, term_1, bridge_0) as 1,
    (hire_2, term_2, bridge_1) as 2,
    (hire_3, term_3, bridge_2) as 3)) t
)
select * from cte1;

    EMPLID        NUM HIRE       TERM       PRE NEX
---------- ---------- ---------- ---------- --- ---
      1001          1 2011-09-22 2011-12-09 No  Yes
      1001          2 2012-07-01 2013-07-02 Yes Yes
      1001          3 2014-05-03 2015-06-16 Yes No 
      1002          1 1999-03-29 1999-06-30 No  Yes
      1002          2 2000-01-24 2004-03-29 Yes No 
      1002          3 2007-11-05 2015-06-16 No  No 
...

This adds dummy term_3 and bridge_0 columns to the table; then unpivots it so you get one row for each set of entires; then it defaults the last open-ended hire to end at today's date, on the assumption you want their total hire time up to today. Because the default behaviour of unpivot is to exclude nulls, this will not create rows for unused 'levels', just up to the last hire date.

I've only used the three sets, but you can easily extend this to all six by changing the dummy column names to term_6 etc. and adding more in clauses.

From that you can see the contiguous periods, and can ignore any row where which is bridging both sides - where prev_bridge and next_bridge are both Yes. (Normally you'd use both lead and lag to work this out from the dates, but might as well use the bridge flags we already have).

with cte1 as (
  ...
),
cte2 as (
  select emplid, num,
    case when prev_bridge = 'No' then hire end as hire,
    case when next_bridge = 'No' then term end as term
  from cte1
  where prev_bridge = 'No' or next_bridge = 'No'
)
select * from cte2
order by emplid, num;

    EMPLID        NUM HIRE       TERM     
---------- ---------- ---------- ----------
      1001          1 2011-09-22           
      1001          3            2015-06-16
      1002          1 1999-03-29           
      1002          2            2004-03-29
      1002          3 2007-11-05 2015-06-16
...

From that you can get the total period covered by each contiguous group, by using lead and lag to compress the periods split across rows (with null hire or term), and using distinct since that will generate duplicates:

with cte1 as (
  ...
),
cte2 as (
  ...
),
cte3 as (
  select distinct emplid,
    case when hire is null then lag(num)
      over (partition by emplid order by num) else num end as num,
    months_between(case when term is null then lead(term)
        over (partition by emplid order by num) else term end,
      case when hire is null then lag(hire) 
        over (partition by emplid order by num) else hire end) as period_months
  from cte2
)
select emplid, period_months
from cte3
order by emplid, num;

    EMPLID PERIOD_MONTHS
---------- -------------
      1001         44.83
      1002         60.00
      1002         91.38
...

And finally sum those distinct periods:

with cte1 as (
  select t.emplid, t.num, t.hire,
    nvl(t.tmp_term, sysdate) as term, bridge as prev_bridge,
    nvl(lead(bridge) over (partition by emplid order by num), 'No') as next_bridge
  from (select t42.*, null as term_3, 'No' as bridge_0 from t42)
  unpivot ((hire, tmp_term, bridge) for num in (
    (hire_1, term_1, bridge_0) as 1,
    (hire_2, term_2, bridge_1) as 2,
    (hire_3, term_3, bridge_2) as 3)) t
),
cte2 as (
  select emplid, num,
    case when prev_bridge = 'No' then hire end as hire,
    case when next_bridge = 'No' then term end as term
  from cte1
  where prev_bridge = 'No' or next_bridge = 'No'
),
cte3 as (
  select distinct emplid,
    case when hire is null then lag(num)
      over (partition by emplid order by num) else num end as num,
    months_between(case when term is null then lead(term)
        over (partition by emplid order by num) else term end,
      case when hire is null then lag(hire) 
        over (partition by emplid order by num) else hire end) as period_months
  from cte2
)
select emplid, sum(period_months) as total_months,
  round(sum(period_months)/12, 2) as total_years
from cte3
group by emplid
order by emplid;

    EMPLID TOTAL_MONTHS TOTAL_YEARS
---------- ------------ -----------
      1001        44.83        3.74
      1002       151.38       12.62
      1003        70.00        5.83
...

You said you wanted the answer in years, so you can divide by 12, and optionally truncate (or round, or floor, or ceil); you may want to do that before you sum or after you sum, which may give different results sometimes, but I've done if after here just as a demo.

SQL Fiddle with full results, including for a row that only has the first hire date, and a row with two gaps.


If you're only interested in the most recent contiguous/bridged period, this can be simplified to:

with cte1 as (
  select t.emplid, t.num,
    case when t.bridge = 'No' then t.hire else null end as hire,
    case when t.hire is not null and t.term is null
      then sysdate else t.term end as term
  from (select emplid, 'No' as bridge_0, hire_1, term_1,
    nvl2(hire_2, bridge_1, null) as bridge_1, hire_2, term_2,
    nvl2(hire_3, bridge_2, null) as bridge_2, hire_3, null as term_3
    from t42
  )
  unpivot ((hire, term, bridge) for num in (
    (hire_1, term_1, bridge_0) as 1,
    (hire_2, term_2, bridge_1) as 2,
    (hire_3, term_3, bridge_2) as 3)) t
)
select emplid,
  round(months_between(max(term), max(hire)) / 12, 2) as total_years
from cte1
group by emplid
order by emplid;

    EMPLID TOTAL_YEARS
---------- -----------
      1001        3.74
      1002        7.62
      1003        5.84
      1004        7.90
      1005        5.10
      1006        6.42
      1007        4.59
      1008        4.11
      1009        6.68
      1010        6.85
      1011        3.31
      1012        1.88
      5011        4.91

SQL Fiddle.


1 - don't look at the edit history...

Upvotes: 1

Alex Poole
Alex Poole

Reputation: 191275

As @PonderStibbons pointed out, this can be done quite simply by adding together the span of each hired period, and also adding the span of the between-hiring periods if the bridge value is 'Yes'. Open-ended hirings and different numbers of hirings can be handled by treating all null dates as the same, presumably as sysdate if you want the current hiring period to be counted up to today.

You can either calculate as days and divide by a nominal value for the number of days in the year (again as Ponder showed!), or use month subtraction and divide by 12:

select emplid,
  round((months_between(nvl(term_1, sysdate), hire_1)
    + case when bridge_1 = 'Yes' then months_between(hire_2, term_1) else 0 end
    + months_between(nvl(term_2, sysdate), nvl(hire_2, sysdate))
    + case when bridge_2 = 'Yes' then months_between(hire_3, term_2) else 0 end
    + months_between(sysdate, nvl(hire_3, sysdate))) / 12, 2) as total_years
from t42
order by emplid;

    EMPLID TOTAL_YEARS
---------- -----------
      1001        3.74
      1002       12.62
      1003        5.84
      1004        7.90
      1005        5.10
      1006        6.42
      1007        7.21
      1008        4.29
      1009        6.68
      1010        6.85
      1011        3.31
      1012        2.21
      5011       20.22

SQL Fiddle

For all six potential hiring periods, repeat the pattern; only the first and last clauses in the addition will be different, as hire_1 is always present, and term_6 won't exist.


If you only want the most recent contiguous/bridged period, not the total of all contiguous/bridged periods, you need to be more selective about which periods are included - starting from the most recent and working backwards, taking all later bridges into account:

select emplid,
  round((
    -- 3rd period
    months_between(sysdate, nvl(hire_3, sysdate))
      -- gap between 3rd and 2nd
      + case when (hire_3 is not null and bridge_2 = 'Yes') then
          months_between(hire_3, term_2) else 0 end
      -- 2nd period
      + case when (hire_3 is null or bridge_2 = 'Yes') then
          months_between(nvl(term_2, sysdate), nvl(hire_2, sysdate)) else 0 end
      -- gap between 2nd and 1st
      + case when (hire_3 is not null and bridge_2 = 'Yes')
            and (hire_1 is not null and bridge_1 = 'Yes') then
          months_between(hire_2, term_1) else 0 end
      -- 1st period
      + case when (hire_3 is null or bridge_2 = 'Yes')
            and (hire_2 is null or bridge_1 = 'Yes') then
          months_between(nvl(term_1, sysdate), nvl(hire_1, sysdate)) else 0 end
    ) / 12, 2) as total_years
from t42
order by emplid;

    EMPLID TOTAL_YEARS
---------- -----------
      1001        3.74
      1002        7.62
      1003        5.84
      1004        7.90
      1005        5.10
      1006        6.42
      1007        4.59
      1008        4.11
      1009        6.68
      1010        6.85
      1011        3.31
      1012        1.88
      5011        4.91

SQL Fiddle, including a breakdown for each period/gap if it is should be included.

Upvotes: 0

Related Questions