Reputation: 1219
I have a table structured like below, that shows when employees are added (operation=I) or removed (operation=D) from an account as a particular role
Account | Employee | Role | Operation | OperationTimestamp
ABC | 1 | Rep | I | 1/1/2018
DEF | 1 | Mgr | I | 1/1/2018
ABC | 1 | Rep | D | 3/31/2018
ABC | 1 | Rep | I | 7/1/2018
ABC | 1 | Rep | D | 12/31/2018
ABC | 2 | Mgr | I | 1/1/2018
DEF | 2 | Exc | I | 1/1/2018
ABC | 2 | Mgr | D | 3/31/2018
ABC | 2 | Mgr | I | 6/1/2018
ABC | 2 | Mgr | D | 10/31/2018
(I = Insert, D = Delete)
I need to develop a query that would return the account, employee, role and date range that the employee was on that account like so:
Account | Employee | Role | StartingDate | EndingDate
ABC | 1 | Rep | 1/1/2018 | 3/31/2018
DEF | 1 | Mgr | 1/1/2018 | NULL
ABC | 1 | Rep | 7/1/2018 | 12/31/2018
ABC | 2 | Mgr | 1/1/2018 | 3/31/2018
DEF | 2 | Exc | 1/1/2018 | NULL
ABC | 2 | Mgr | 6/1/2018 | 10/31/2018
So as you can see from the result set, if an employee is added to an account, but has not been removed, then then EndingDate should be NULL
.
Where I'm struggling is that you can have the same employee added/removed from an account multiple times and/or in multiple roles. My guts says I need to order the transactions by account>employee>role>date and somehow group every 2 lines together (since it should always be an I operation, followed by a D operation) but I'm not sure how to handle the "missing" deletes if they're still on an account.
Upvotes: 0
Views: 58
Reputation: 1270351
I think you just need lead()
or a cumulative min()
. Here is what I mean:
select account, employee, role, OperationTimestamp, EndingDate
from (select t.*,
min(case when operation = 'D' then OperationTimestamp end) over
(partition by account, employee, role
order by OperationTimestamp desc
) as EndingDate
from t
) t
where operation = 'I';
Upvotes: 0
Reputation: 48850
Assumption: An I
operation is never followed by another I
for the same combination (account, employee, role); if there is a next row (it may not be for that combination) it's always a D
.
Data:
create table my_table (
Account varchar(3),
Employee int,
role varchar(3),
Operation varchar(1),
OperationTimestamp datetime
);
insert into my_table values
('ABC',1,'Rep','I','20180101')
,('DEF',1,'Mgr','I','20180101')
,('ABC',1,'Rep','D','20180331')
,('ABC',1,'Rep','I','20180701')
,('ABC',1,'Rep','D','20181231')
,('ABC',2,'Mgr','I','20180101')
,('DEF',2,'Exc','I','20180101')
,('ABC',2,'Mgr','D','20180331')
,('ABC',2,'Mgr','I','20180601')
,('ABC',2,'Mgr','D','20181031');
If the above is true, then I would use the following query:
with
x as (
select
account, employee, role, operationtimestamp, operation,
lead(operation)
over(partition by account, employee, role
order by account, employee, role, operationtimestamp)
as next_op,
lead(operationtimestamp)
over(partition by account, employee, role
order by account, employee, role, operationtimestamp)
as next_ts
from my_table
),
y as(
select
account, employee, role,
operationtimestamp as startingdate,
next_ts as endingdate
from x
where operation = 'I'
)
select *
from y
order by employee, startingdate
Result:
account employee role startingdate endingdate
------- -------- ---- --------------------- ---------------------
ABC 1 Rep 2018-01-01 00:00:00.0 2018-03-31 00:00:00.0
DEF 1 Mgr 2018-01-01 00:00:00.0 <null>
ABC 1 Rep 2018-07-01 00:00:00.0 2018-12-31 00:00:00.0
ABC 2 Mgr 2018-01-01 00:00:00.0 2018-03-31 00:00:00.0
DEF 2 Exc 2018-01-01 00:00:00.0 <null>
ABC 2 Mgr 2018-06-01 00:00:00.0 2018-10-31 00:00:00.0
Upvotes: 1
Reputation: 12243
With a row_number
and a self join
this is fairly straightforward:
declare @t table(Account varchar(3), Employee int, EmpRole varchar(3), Operation varchar(1), OperationTimestamp datetime);
insert into @t values
('ABC',1,'Rep','I','20180101')
,('DEF',1,'Mgr','I','20180101')
,('ABC',1,'Rep','D','20180331')
,('ABC',1,'Rep','I','20180701')
,('ABC',1,'Rep','D','20181231')
,('ABC',2,'Mgr','I','20180101')
,('DEF',2,'Exc','I','20180101')
,('ABC',2,'Mgr','D','20180331')
,('ABC',2,'Mgr','I','20180601')
,('ABC',2,'Mgr','D','20181031');
with d as
(
select Account
,Employee
,EmpRole
,Operation
,OperationTimestamp
,row_number() over (partition by Account, Employee, EmpRole order by OperationTimestamp) as ord
from @t
)
select s.Account
,s.Employee
,s.EmpRole
,s.OperationTimestamp as OperationTimestampStart
,e.OperationTimestamp as OperationTimestampEnd
from d as s
left join d as e
on s.Account = e.Account
and s.Employee = e.Employee
and s.EmpRole = e.EmpRole
and s.ord = e.ord-1
where s.Operation = 'I';
+---------+----------+---------+-------------------------+-----------------------+
| Account | Employee | EmpRole | OperationTimestampStart | OperationTimestampEnd |
+---------+----------+---------+-------------------------+-----------------------+
| ABC | 1 | Rep | 2018-01-01 | 2018-03-31 |
| ABC | 1 | Rep | 2018-07-01 | 2018-12-31 |
| ABC | 2 | Mgr | 2018-01-01 | 2018-03-31 |
| ABC | 2 | Mgr | 2018-06-01 | 2018-10-31 |
| DEF | 1 | Mgr | 2018-01-01 | NULL |
| DEF | 2 | Exc | 2018-01-01 | NULL |
+---------+----------+---------+-------------------------+-----------------------+
Upvotes: 1