AFJ
AFJ

Reputation: 159

Select Distinct Rows Outside of Time Frame

I am trying to use SQL to select distinct data entries based on the time difference between one entry and the next. It's easier to explain with an example:

My data table has

Part    DateTime   
123     12:00:00
123     12:00:05
123     12:00:06
456     12:10:23
789     12:12:13
123     12:14:32

I would like to return all rows as long with the limitation that if there are multiple entries with the same "Part" number I would like to retrieve only those that have a difference of at least 5 minutes.

The query should return:

Part    DateTime   
123     12:00:00
456     12:10:23
789     12:12:13
123     12:14:32

The code I'm using is the following:

SELECT data1.*, to_char(data1.scan_time, 'yyyymmdd hh24:mi:ss') 

FROM data data1 

where exists
(
    select * 

    from data data2

    where data1.part_serial_number = data2.part_serial_number AND 
    data2.scan_time + 5/1440 >= data1.scan_time 
    and data2.info is null
)

order by to_char(data1.scan_time, 'yyyymmdd hh24:mi:ss'), data1.part_serial_number

This is not working unfortunately. Does anyone know what i'm doing wrong or can suggest an alternate approach??

Thanks

Upvotes: 3

Views: 231

Answers (3)

MPelletier
MPelletier

Reputation: 16697

This has not been verified, but essentially, the trick is to group by part AND time divided by 5 minutes (floored).

select part, min(scan_time)
from data
group by part, floor(scan_time/(5/1440))
order by scan_time;

Upvotes: 0

Justin Cave
Justin Cave

Reputation: 231781

Analytic functions to the rescue.

You can use the analytic function LEAD to get the data for the next row for the part.

SQL> ed
Wrote file afiedt.buf

  1  with x as (
  2    select 123 part, timestamp '2011-12-08 00:00:00' ts
  3      from dual
  4    union all
  5    select 123, timestamp '2011-12-08 00:00:05'
  6      from dual
  7    union all
  8    select 123, timestamp '2011-12-08 00:00:06'
  9      from dual
 10    union all
 11    select 456, timestamp '2011-12-08 00:10:23'
 12      from dual
 13    union all
 14    select 789, timestamp '2011-12-08 00:12:13'
 15      from dual
 16    union all
 17    select 123, timestamp '2011-12-08 00:14:32'
 18      from dual
 19  )
 20  select part,
 21         ts,
 22         lead(ts) over (partition by part order by ts) next_ts
 23*   from x
SQL> /

      PART TS                              NEXT_TS
---------- ------------------------------- -------------------------------
       123 08-DEC-11 12.00.00.000000000 AM 08-DEC-11 12.00.05.000000000 AM
       123 08-DEC-11 12.00.05.000000000 AM 08-DEC-11 12.00.06.000000000 AM
       123 08-DEC-11 12.00.06.000000000 AM 08-DEC-11 12.14.32.000000000 AM
       123 08-DEC-11 12.14.32.000000000 AM
       456 08-DEC-11 12.10.23.000000000 AM
       789 08-DEC-11 12.12.13.000000000 AM

6 rows selected.

Once you've done that, then you can create an inline view and simply select those rows where the next date is more than 5 minutes after the current date.

SQL> ed
Wrote file afiedt.buf

  1  with x as (
  2    select 123 part, timestamp '2011-12-08 00:00:00' ts
  3      from dual
  4    union all
  5    select 123, timestamp '2011-12-08 00:00:05'
  6      from dual
  7    union all
  8    select 123, timestamp '2011-12-08 00:00:06'
  9      from dual
 10    union all
 11    select 456, timestamp '2011-12-08 00:10:23'
 12      from dual
 13    union all
 14    select 789, timestamp '2011-12-08 00:12:13'
 15      from dual
 16    union all
 17    select 123, timestamp '2011-12-08 00:14:32'
 18      from dual
 19  )
 20  select part,
 21         ts
 22    from (
 23      select part,
 24             ts,
 25             lead(ts) over (partition by part order by ts) next_ts
 26        from x )
 27   where next_ts is null
 28*     or next_ts > ts + interval '5' minute
SQL> /

      PART TS
---------- -------------------------------
       123 08-DEC-11 12.00.06.000000000 AM
       123 08-DEC-11 12.14.32.000000000 AM
       456 08-DEC-11 12.10.23.000000000 AM
       789 08-DEC-11 12.12.13.000000000 AM

Upvotes: 3

dani herrera
dani herrera

Reputation: 51715

AFJ,

let's supose that we have a new field that tell us if exists a previus entry for this Part in the previous 5 minutes, then, taking the rows that this field is set to False we have the result.

select  
  Part, 
  DateTime,
  coalesce(
    (select distinct 1
     from data ds
     where ds.Part = d.Part
       and ds.DateTime between d.DateTime and d.DateTime - 5/1440
    )
    , 0) as exists_previous 
from data d

The subquery checks if they are a row with same Part in previous 5 minutes inteval

Result must be:

Part    DateTime   exists_previous
123     12:00:00   0
123     12:00:05   1
123     12:00:06   1
456     12:10:23   0
789     12:12:13   0
123     12:14:32   0

now, filter to get only rows with 0:

   select Part, DateTime from 
   (select  
      Part, 
      DateTime,
      coalesce(
        (select distinct 1
         from data ds
         where ds.Part = d.Part
           and ds.DateTime between d.DateTime and d.DateTime - 5/1440
        )
        , 0) as exists_previous 
    from data D
   ) T where T.exists_previous = 0

Disclaimer: not tested.

Upvotes: 1

Related Questions