Richard F.
Richard F.

Reputation: 67

Delete rows in single table in SQL Server where timestamp column differs

I have a table of employee timeclock punches that looks something like this:

| EmployeeID | PunchDate  | PunchTime | PunchType | Sequence |
|------------|------------|-----------|-----------|----------|
|       5386 | 12/27/2016 | 03:57:42  | On Duty   |      552 |
|       5386 | 12/27/2016 | 09:30:00  | Off Duty  |      563 |
|       5386 | 12/27/2016 | 010:02:00 | On Duty   |      564 |
|       5386 | 12/27/2016 | 12:10:00  | Off Duty  |      570 |
|       5386 | 12/27/2016 | 12:22:00  | On Duty   |      571 |
|       5386 | 12/27/2016 | 05:13:32  | Off Duty  |      578 |  

What I need to do is delete any rows where the difference in minutes between an Off Duty punch and the following On Duty punch is less than, say, 25 minutes. In the example above, I would want to remove Sequence 570 and 571.

I'm already creating this table by pulling all Off Duty punches from another table and using this query to pull all On Duty punches that follow an Off Duty punch:

SELECT *  FROM [dbo].[Punches]
INSERT INTO [dbo].[UpdatePunches (EmployeeID,PunchDate,PunchTime,PunchType,Sequence)
SELECT *  FROM [dbo].[Punches]
WHERE Sequence IN (
SELECT Sequence + 1
FROM [dbo].[Punches]
WHERE PunchType LIKE 'Off Duty%') AND
PunchType LIKE 'On Duty%'   

I have been trying to fit some sort of DATEDIFF query both in this code and as a separate step to weed these out, but have not had any luck. I can't use specific Sequence numbers because those are going to change for every punch.

I'm using SQL Server 2008.

Any suggestions would be much appreciated.

Upvotes: 1

Views: 845

Answers (4)

pizzaslice
pizzaslice

Reputation: 478

Maybe something like this would be easy to slap in there.. This simply uses a subquery to find the next 'on duty' punch and compare it in the main query to the 'off duty' punch.

          Delete
          FROM [dbo].[Punches] p
          where p.PunchTime >=
          dateadd(minute, -25, isnull (
   (select top 1 p2.PunchTime from [dbo].[Punches] p2 where 
   p2.EmployeeID=p.EmployeeID and p2.PunchType='On Duty' 
   and p1.Sequence < p2.Sequence and p2.PunchDate=p.PunchDate
   order by p2.Sequence asc)
   ),'2500-01-01')
          and p.PunchType='Off Duty'

Upvotes: 1

Vamsi Prabhala
Vamsi Prabhala

Reputation: 49260

You can assign rownumbers per employee based on punchdate and punchtime and join each row with the next based on ascending order of date and time.

Thereafter, get the rownumbers of those rows where the difference is less than 25 minutes and finally delete those rows.

with rownums as 
(select t.*,row_number() over(partition by employeeid 
                              order by cast(punchdate +' '+punchtime as datetime) ) as rn
 from t)
,rownums_to_delete as 
(
 select r1.rn,r1.employeeid
 from rownums r1
 join rownums r2 on r1.employeeid=r2.employeeid and r1.rn=r2.rn+1
 where dateadd(minute,25,cast(r2.punchdate +' '+r2.punchtime as datetime)) > cast(r1.punchdate +' '+r1.punchtime as datetime)
 and r1.punchtype <> r2.punchtype
 union all
 select r2.rn, r2.employeeid
 from rownums r1
 join rownums r2 on r1.employeeid=r2.employeeid and r1.rn=r2.rn+1
 where dateadd(minute,25,cast(r2.punchdate +' '+r2.punchtime as datetime)) > cast(r1.punchdate +' '+r1.punchtime as datetime)
 and r1.punchtype <> r2.punchtype
)
delete r
from rownums_to_delete rd
join rownums r on rd.employeeid=r.employeeid and r.rn=rd.rn

Sample Demo

If date and time columns are not varchar but actual date and time datatype, use punchdate+punchtime in the query.

Edit: An easier version of the query would be

with todelete as (
select t1.employeeid,cast(t2.punchdate+' '+t2.punchtime as datetime) as punchtime,
t2.punchtype,t2.sequence,
cast(t1.punchdate+' '+t1.punchtime as datetime) next_punchtime, 
t1.punchtype as next_punchtype,t1.sequence as next_sequence
from t t1
join t t2 on t1.employeeid=t2.employeeid 
and cast(t2.punchdate+' '+t2.punchtime as datetime) between dateadd(minute,-25,cast(t1.punchdate+' '+t1.punchtime as datetime)) and cast(t1.punchdate+' '+t1.punchtime as datetime) 
where t2.punchtype <> t1.punchtype
    )
delete t
from t 
join todelete td on t.employeeid = td.employeeid 
and cast(t.punchdate+' '+t.punchtime as datetime) in (td.punchtime,td.next_punchtime)
;

Upvotes: 2

Gordon Linoff
Gordon Linoff

Reputation: 1270463

SQL Server has a nice ability called updatable CTEs. Using lead() and lag(), you can do exactly what you want. The following assumes that the date is actually stored as a datetime -- this is just for the convenience of adding the date and time together (you can also explicitly use conversion):

with todelete as (
      select tcp.*,
             (punchdate + punchtime) as punchdatetime.
             lead(punchtype) over (partition by employeeid order by punchdate, punchtime) as next_punchtype,
             lag(punchtype) over (partition by employeeid order by punchdate, punchtime) as prev_punchtype,
             lead(punchdate + punchtime) over (partition by employeeid order by punchdate, punchtime) as next_punchdatetime,
             lag(punchdate + punchtime) over (partition by employeeid order by punchdate, punchtime) as prev_punchdatetime
      from timeclockpunches tcp
     )
delete from todelete
    where (punchtype = 'Off Duty' and
           next_punchtype = 'On Duty' and
           punchdatetime > dateadd(minute, -25, next_punchdatetime)
          ) or
          (punchtype = 'On Duty' and
           prev_punchtype = 'Off Duty' and
           prev_punchdatetime > dateadd(minute, -25, punchdatetime)
          );

EDIT:

In SQL Server 2008, you can do use the same idea, just not as efficiently:

delete t
    from t outer apply
         (select top 1 tprev.*
          from t tprev
          where tprev.employeeid = t.employeeid and
                (tprev.punchdate < t.punchdate or
                 (tprev.punchdate = t.punchdate and tprev.punchtime < t.punchtime)
                )
          order by tprev.punchdate desc, tprev.punchtime desc
         ) tprev outer apply
         (select top 1 tnext.*
          from t tnext
          where tnext.employeeid = t.employeeid and
                (t.punchdate < tnext.punchdate or
                 (t.punchdate = tnext.punchdate and t.punchtime < tnext.punchtime)
                )
          order by tnext.punchdate desc, tnext.punchtime desc
         ) tnext
where (t.punchtype = 'Off Duty' and
       tnext.punchtype = 'On Duty' and
       t.punchdatetime > dateadd(minute, -25, tnext.punchdatetime)
      ) or
      (t.punchtype = 'On Duty' and
       tprev.punchtype = 'Off Duty' and
       tprev.punchdatetime > dateadd(minute, -25, t.punchdatetime)
      );

Upvotes: 2

Mike
Mike

Reputation: 550

You could create a DateTime from the Date and Time fields in a CTE and then lookup the next On Duty Time after the Off Duty Time like below:

;
WITH OnDutyDateTime AS
(
    SELECT 
    EmployeeID,
    Sequence,
    DutyDateTime = DATEADD(ms, DATEDIFF(ms, '00:00:00', PunchTime), CONVERT(DATETIME, PunchDate)) 
    FROM
    #TempEmployeeData 
    where PunchType = 'On Duty'
),
OffDutyDateTime As
(
    SELECT 
    EmployeeID,
    Sequence,
    DutyDateTime = DATEADD(ms, DATEDIFF(ms, '00:00:00', PunchTime), CONVERT(DATETIME, PunchDate)) 
    FROM
    #TempEmployeeData 
    where PunchType = 'Off Duty'
)

SELECT 
    OffDutyDateTime = DutyDateTime,
    OnDutyDateTime = (SELECT TOP 1 DutyDateTime FROM OnDutyDateTime WHERE EmployeeID = A.EmployeeID AND Sequence > A.Sequence ORDER BY Sequence ASC ),
    DiffInMinutes = DATEDIFF(minute,DutyDateTime,(SELECT TOP 1 DutyDateTime FROM OnDutyDateTime WHERE EmployeeID = A.EmployeeID AND Sequence > A.Sequence ORDER BY Sequence ASC ))

FROM
OffDutyDateTime A


OffDutyDateTime         OnDutyDateTime          DiffInMinutes
----------------------- ----------------------- -------------
2016-12-27 09:30:00.000 2016-12-27 10:02:00.000 32
2016-12-27 12:10:00.000 2016-12-27 12:22:00.000 12
2016-12-28 05:13:32.000 NULL                    NULL

(3 row(s) affected)

Upvotes: 1

Related Questions