PyQL
PyQL

Reputation: 1830

Find duplicates within a specific period

I have a table with the following structure

ID  Person  LOG_TIME
-----------------------------------
1   1       2012-05-21 13:03:11.550
2   1       2012-05-22 13:09:37.050 <--- this is duplicate
3   1       2012-05-28 13:09:37.183
4   2       2012-05-20 15:09:37.230
5   2       2012-05-22 13:03:11.990 <--- this is duplicate
6   2       2012-05-24 04:04:13.222 <--- this is duplicate
7   2       2012-05-29 11:09:37.240

I have some application job that fills this table with data.

There is a business rule that each person should have only 1 record in every 7 days.

From the above example, records # 2,5 and 6 are considered duplicates while 1,3,4 and 7 are OK.

I want to have a SQL query that checks if there are records for the same person in less than 7 days.

Upvotes: 2

Views: 2880

Answers (4)

declare @Count int

set @count=(
select COUNT(*) 
from timeslot  
where (( (TimeFrom<@Timefrom and TimeTo >@Timefrom) 
      or (TimeFrom<@Timeto and TimeTo >@Timeto))
      or (TimeFrom=@Timefrom or TimeTo=@Timeto)))

Upvotes: 0

Oleksandr Fedorenko
Oleksandr Fedorenko

Reputation: 16904

;WITH cte AS
 (
  SELECT ID, Person, LOG_TIME,
         DATEDIFF(d, MIN(LOG_TIME) OVER (PARTITION BY Person), LOG_TIME) AS diff_date
  FROM dbo.Log_time
  )
SELECT *
FROM cte
WHERE diff_date BETWEEN 1 AND 6

Demo on SQLFiddle

Upvotes: 1

Tim S
Tim S

Reputation: 707

Please see my attempt on SQLFiddle here.

You can use a join based on DATEDIFF() to find records which are logged less than 7 days apart:

WITH TooClose
AS
(
    SELECT
        a.ID AS BeforeID,
        b.ID AS AfterID
    FROM
        Log a
        INNER JOIN Log b ON a.Person = b.Person
                        AND a.LOG_TIME < b.LOG_TIME
                        AND DATEDIFF(DAY, a.LOG_TIME, b.LOG_TIME) < 7
)

However, this will include records which you don't consider "duplicates" (for instance, ID 3, because it is too close to ID 2). From what you've said, I'm inferring that a record isn't a "duplicate" if the record it is too close to is itself a "duplicate".

So to apply this rule and get the final list of duplicates:

SELECT
    AfterID AS ID
FROM
    TooClose
WHERE
    BeforeID NOT IN (SELECT AfterID FROM TooClose)

Upvotes: 2

bonCodigo
bonCodigo

Reputation: 14361

Please take a look at this sample.

Reference: SQLFIDDLE

Query:

select person, 
datediff(max(log_time),min(log_time)) as diff,
count(log_time)
 from pers 
 group by person
;

select y.person, y.ct
from (
select person, 
datediff(max(log_time),min(log_time)) as diff,
count(log_time) as ct
 from pers 
 group by person) as y
where y.ct > 1
and y.diff <= 7
;


PERSON  DIFF    COUNT(LOG_TIME)
1   1   3
2   8   3


PERSON  CT
1   3

Upvotes: 0

Related Questions