Consolidating fact records by timestamp

Question

I am in a Microsoft SQL Server database. I would like to consolidate start and end periods for employees. For example, given the following simple table:

create table dbo.foo (
employee_key bigint,
effective_start_date date,
effective_end_date date,
account_name varchar(100));


insert into foo (employee_key,
effective_start_date,
effective_end_date,
account_name)
values (1
    ,'2017-01-01'
    ,'2017-01-31'
    ,'Google')
,(1
    ,'2017-02-01'
    ,'2017-02-28'
    ,'Apple')
,(1
    ,'2017-03-01'
    ,'2017-03-31'
    ,'Google')
,(1
    ,'2017-04-01'
    ,'9999-12-31'
    ,'Google')

The employee has moved accounts a couple times. I would like to maintain the change in February and back again. However, I do not want to see 2 records at the end. Instead, I want the tenure at Google to run from 2017-03-01 to 9999-12-31.

By the way, this is one example. That is, the "broken" records aren't always at the end of the employee's tenure.

Expected results:

employee_key | effective_start_date | effective_end_date | account_name
1            | 2017-01-01           | 2017-01-31         | Google
1            | 2017-02-01           | 2017-02-28         | Apple
1            | 2017-03-01           | 9999-12-31         | Google

Kannan Kandasamy · Accepted Answer

You can use row_number and get this results... I used two subqueries but it can be solved without that as well:

;With Cte as (
select *, Bucket = sum(accn) over(partition by employee_key order by effective_start_date) 
    from (
        Select *,case when account_name <> lag(account_name) over(partition by employee_key order by effective_start_date) 
            then 1 else 0 end as AccN
            from foo ) A
    )
    Select top (1) with ties employee_key, Min(effective_start_date) over(partition by employee_key,Bucket) AS [effective_start_date],      
                Max(effective_end_date) over(partition by employee_key,Bucket) AS [effective_end_date], account_name
    from cte
    order by row_number() over(partition by employee_key, bucket order by effective_start_date)

Output as below:

+--------------+----------------------+--------------------+--------------+
| employee_key | Effective_start_Date | Effective_End_date | account_name |
+--------------+----------------------+--------------------+--------------+
|            1 | 2017-01-01           | 2017-01-31         | Google       |
|            1 | 2017-02-01           | 2017-02-28         | Apple        |
|            1 | 2017-03-01           | 9999-12-31         | Google       |
+--------------+----------------------+--------------------+--------------+

Demo

Consolidating fact records by timestamp

Answers (2)

Related Questions