Reputation: 4617
I have a table containing many columns, I have to make my selection according to these two columns:
TIME ID
-216 AZA
215 AZA
56 EA
-55 EA
66 EA
-03 AR
03 OUI
-999 OP
999 OP
04 AR
87 AR
The expected output is
TIME ID
66 EA
03 OUI
87 AR
I need to select the rows with no matches. There are rows which have the same ID, and almost the same time but inversed with a little difference. For example the first row with the TIME -216 matches the second record with time 215. I tried to solve it in many ways, but everytime I find myself lost.
Upvotes: 1
Views: 476
Reputation: 1188
First step -- find rows with duplicate IDs. Second step -- filter for rows which are near-inverse duplicates.
First step:
SELECT t1.TIME, t2.TIME, t1.ID FROM mytable t1 JOIN mytable
t2 ON t1.ID = t2.ID AND t1.TIME > t2.TIME;
The second part of the join clause ensures we only get one record for each pair.
Second step:
SELECT t1.TIME,t2.TIME,t1.ID FROM mytable t1 JOIN mytable t2 ON t1.ID = t2.ID AND
t1.TIME > t2.TIME WHERE ABS(t1.TIME + t2.TIME) < 3;
This will produce some duplicate results if eg. (10, FI), (-10, FI) and (11, FI)
are in your table as there are two valid pairs. You can possibly filter these out as follows:
SELECT t1.TIME,MAX(t2.TIME),t1.ID FROM mytable t1 JOIN mytable t2 ON
t1.ID = t2.ID AND t1.TIME > t2.TIME WHERE ABS(t1.TIME + t2.TIME) < 3 GROUP BY
t1.TIME,t1.ID;
But it's unclear which result you want to drop. Hopefully this points you in the right direction, though!
Upvotes: 2
Reputation: 14295
Does this help?
create table #RawData
(
[Time] int,
ID varchar(3)
)
insert into #rawdata ([time],ID)
select -216, 'AZA'
union
select 215, 'AZA'
union
select 56, 'EA'
union
select -55, 'EA'
union
select 66, 'EA'
union
select -03, 'AR'
union
select 03, 'OUI'
union
select -999, 'OP'
union
select 999, 'OP'
union
select 04, 'AR'
union
select 87, 'AR'
union
-- this value added to illustrate that the algorithm does not ignore this value
select 156, 'EA'
--create a copy with an ID to help out
create table #Data
(
uniqueId uniqueidentifier,
[Time] int,
ID varchar(3)
)
insert into #Data(uniqueId,[Time],ID) select newid(),[Time],ID from #RawData
declare @allowedDifference int
select @allowedDifference = 1
--find duplicates with matching inverse time
select *, d1.Time + d2.Time as pairDifference from #Data d1 inner join #Data d2 on d1.ID = d2.ID and (d1.[Time] + d2.[Time] <=@allowedDifference and d1.[Time] + d2.[Time] >= (-1 * @allowedDifference))
-- now find all ID's ignoring these pairs
select [Time],ID from #data
where uniqueID not in (select d1.uniqueID from #Data d1 inner join #Data d2 on d1.ID = d2.ID and (d1.[Time] + d2.[Time] <=3 and d1.[Time] + d2.[Time] >= -3))
Upvotes: 0