Reputation: 63
i have a newby sql related question.
let's say i have this simple table:
A B
------ ------
a b
a c
b a
m n
m o
n m
i'd like to query for records which have "counterparts" only, i.e. i want to get a b
only if there is b a
in the table but i want to skip the "back links" (which is b a
here).
to sum up i'd like to get the following result
A B
------ ------
a b
m n
this sql query does not work since in the case b a
is handled the a b
is removed from my result set.
SELECT DISTINCT x1.A, x1.B
FROM TEST x1, TEST x2
WHERE x1.A = x2.B AND x1.B = x2.A -- all records /w counterparts only
AND x1.A NOT IN (SELECT B from TEST where B = x1.A) -- skip the "back links"
the 2nd part of the WHERE clause does not work as desired.
do you have any hints? any help with this would be greatly appreciated.
regards peter
p.s. i am using the derby db.
Upvotes: 6
Views: 8648
Reputation: 1
Why do you even need tab2. This should work based on your desired result.
select
distinct tab1.ID,
case when tab1.A < tab1.B then tab1.A else tab1.B end as A,
case when tab1.A > tab1.B then tab1.A else tab1.B end as B
from TEST tab1;
Upvotes: 0
Reputation: 49049
SELECT Distinct
case when tab1.A < tab1.B then tab1.A else tab1.B end as A,
case when tab1.A > tab1.B then tab1.A else tab1.B end as B
FROM
tab inner join tab tab1 on tab.B = tab1.A
WHERE
tab1.B = tab.A
EDIT: based on your updated answer, i think you need this:
select distinct
(case when tab1.A < tab1.B then tab1.A else tab1.B end) as A,
(case when tab1.A > tab1.B then tab1.A else tab1.B end) as B
from TEST tab1 left join TEST tab2 on tab1.B = tab2.A
it gives the same result as your query, but it's much faster since there's no cartesian join.
Upvotes: 1
Reputation: 63
below you will find the sql code to create my table along with the inserted data
CREATE TABLE TEST (A varchar(4), B varchar(4));
INSERT INTO TEST (ID,A,B) VALUES ('1','d','a');
INSERT INTO TEST (ID,A,B) VALUES ('1','c','a');
INSERT INTO TEST (ID,A,B) VALUES ('1','b','a');
INSERT INTO TEST (ID,A,B) VALUES ('1','a','xxx');
INSERT INTO TEST (ID,A,B) VALUES ('1','a','d');
INSERT INTO TEST (ID,A,B) VALUES ('1','a','c');
INSERT INTO TEST (ID,A,B) VALUES ('1','a','b');
INSERT INTO TEST (ID,A,B) VALUES ('2','g','a');
INSERT INTO TEST (ID,A,B) VALUES ('2','a','g');
INSERT INTO TEST (ID,A,B) VALUES ('3','f','b');
INSERT INTO TEST (ID,A,B) VALUES ('3','b','f');
INSERT INTO TEST (ID,A,B) VALUES ('4','s','r');
INSERT INTO TEST (ID,A,B) VALUES ('4','r','s');
INSERT INTO TEST (ID,A,B) VALUES ('5','r','t');
INSERT INTO TEST (ID,A,B) VALUES ('7','h','g');
as described before, using this query:
select distinct tab1.ID,
( case when tab1.A < tab1.B then tab1.A else tab1.B end ) as A,
( case when tab1.A > tab1.B then tab1.A else tab1.B end) as B
from TEST tab1, TEST tab2
... i got the desired result:
ID A B
-- -- --
1 a b
1 a c
1 a d
1 a xxx
2 a g
3 b f
4 r s
5 r t
7 g h
sorry guys, maybe i miss here something but it seems your solutions still does not work as intended.
@fthiella: i tested your solution:
SELECT tab1.*
FROM TEST tab1 LEFT JOIN TEST tab2 on tab1.B=tab2.A
WHERE tab1.A<tab1.B OR tab2.A is null
result (a/b
duplicate, g/h
missing):
ID A B
-- -- --
1 a b
1 a b
1 a c
1 a xxx
2 a g
3 b f
4 r s
5 r t
@wildplasser: it seems this solution does not work, too
SELECT * FROM TEST t
WHERE EXISTS (
SELECT * FROM TEST x
WHERE x.a = t.b AND x.b = t.a
AND x.a > x.b -- tie breaker
);
result (a/xxx
and r/t
are missing):
ID A B
-- -- --
1 a b
1 a c
1 a d
2 a g
3 b f
4 r s
Upvotes: 0
Reputation: 44250
SELECT *
FROM ztable t
WHERE EXISTS (
SELECT * FROM ztable x
WHERE x.a = t.b AND x.b = t.a
AND x.a > x.b -- tie breaker
);
The exists has the advantage that the (corellated) subquery is not visible to the outside query; so the select *
will expand to only the columns of table t.
Upvotes: 0
Reputation: 63
thanks for all answers so far
i've got a slightly modified (easier) version of my first problem. i don't need the check on "counterpart" rows, i only need to skip the back references.
i modified fthiellas solution so far (see below) and it works. somehow, i think there must be a simplier solution.
select distinct
( case when tab1.A < tab1.B then tab1.A else tab1.B end ) as A,
( case when tab1.A > tab1.B then tab1.A else tab1.B end) as B
from TEST tab1, TEST tab2
since i am not restricted to derby, i wonder if there will be a problem with the CASE function when switching to another db system. a 'universal' solution would be nice which works with all the different sql dialects (mysql, postgres, oracle, mssql, etc.)
any ideas?
Upvotes: 0
Reputation: 1431
You could change your last line to:
AND x1.A < x1.B
This assumes that either your columns are never self-referential (eg: a, a) or that you don't want circular references to appear. If you do, then:
AND x1.A <= x1.B
EDIT:
You're probably better off using explicit joins as well:
SELECT DISTINCT
x1.A
, x1.B
FROM
TEST x1
JOIN
TEST x2
ON x1.A = x2.B
AND x1.B = x2.A -- all records /w counterparts only
WHERE x1.A < x1.B --Skip backreference
Upvotes: 6
Reputation: 8508
I do not have a derby db to test on but i think this should work.
As you did not specify how we can decide which one between a/b and b/a is the back link i used the first occurence found to be the correct direction.
The idea behind the query is joining together the same table to get the back link and a row_number representing the position of the found item. Then joining together the tables with the position and taking the one with the first occurence found.
select TOT1.A, TOT2.B
(select distinct t1.A, t1.B, row_number() over() as num
from test t1
join test t2
on t1.A = t2.B and t1.B = t2.A) as TOT1
join
(select distinct t1.A, t1.B, row_number() over() as num
from test t1
join test t2
on t1.A = t2.B and t1.B = t2.A) as TOT2
on TOT1.A = TOT2.B and TOT1.B = TOT2.A and TOT1.NUM < TOT2.NUM
Upvotes: 0