Mansuro
Mansuro

Reputation: 4617

difficult sql query

I have a table containing many columns, I have to make my selection according to these two columns:

TIME   ID
-216   AZA
 215   AZA
  56   EA
 -55   EA
  66   EA
 -03   AR
  03   OUI
-999   OP
 999   OP
  04   AR
  87   AR

The expected output is

 TIME   ID
  66   EA
  03   OUI
  87   AR

I need to select the rows with no matches. There are rows which have the same ID, and almost the same time but inversed with a little difference. For example the first row with the TIME -216 matches the second record with time 215. I tried to solve it in many ways, but everytime I find myself lost.

Upvotes: 1

Views: 476

Answers (2)

bucko
bucko

Reputation: 1188

First step -- find rows with duplicate IDs. Second step -- filter for rows which are near-inverse duplicates.

First step:

SELECT t1.TIME, t2.TIME, t1.ID FROM mytable t1 JOIN mytable
 t2 ON t1.ID = t2.ID AND t1.TIME > t2.TIME;

The second part of the join clause ensures we only get one record for each pair.

Second step:

SELECT t1.TIME,t2.TIME,t1.ID FROM mytable t1 JOIN mytable t2 ON t1.ID = t2.ID AND
 t1.TIME > t2.TIME WHERE ABS(t1.TIME + t2.TIME) < 3;

This will produce some duplicate results if eg. (10, FI), (-10, FI) and (11, FI) are in your table as there are two valid pairs. You can possibly filter these out as follows:

 SELECT t1.TIME,MAX(t2.TIME),t1.ID FROM mytable t1 JOIN mytable t2 ON
 t1.ID = t2.ID AND t1.TIME > t2.TIME WHERE ABS(t1.TIME + t2.TIME) < 3 GROUP BY 
 t1.TIME,t1.ID;

But it's unclear which result you want to drop. Hopefully this points you in the right direction, though!

Upvotes: 2

Nat
Nat

Reputation: 14295

Does this help?

create table #RawData
(
    [Time] int,
    ID varchar(3)
)

insert into #rawdata ([time],ID)
select -216,   'AZA'
union
select 215,   'AZA' 
union
select 56,   'EA' 
union
select -55,   'EA' 
union
select 66,   'EA' 
union
select -03,   'AR' 
union
select 03,   'OUI' 
union
select -999,   'OP' 
union
select 999,   'OP' 
union
select 04,   'AR' 
union
select 87,   'AR' 
union
-- this value added to illustrate that the algorithm does not ignore this value
select 156,   'EA' 

--create a copy with an ID to help out
create table #Data
(
    uniqueId uniqueidentifier,
    [Time] int,
    ID varchar(3)
)

insert into #Data(uniqueId,[Time],ID) select newid(),[Time],ID from #RawData
declare @allowedDifference int
select @allowedDifference = 1
--find duplicates with matching inverse time
select *, d1.Time + d2.Time as pairDifference from #Data d1 inner join #Data d2 on d1.ID = d2.ID and (d1.[Time] + d2.[Time] <=@allowedDifference and d1.[Time] + d2.[Time] >= (-1 * @allowedDifference))

-- now find all ID's ignoring these pairs
select [Time],ID from #data 
where uniqueID not in (select d1.uniqueID from #Data d1 inner join #Data d2 on d1.ID = d2.ID and (d1.[Time] + d2.[Time] <=3 and d1.[Time] + d2.[Time] >= -3))

Upvotes: 0

Related Questions