D. Manka
D. Manka

Reputation: 63

sql select query in same table

i have a newby sql related question.

let's say i have this simple table:

A      B
------ ------
a      b
a      c
b      a
m      n
m      o
n      m

i'd like to query for records which have "counterparts" only, i.e. i want to get a b only if there is b a in the table but i want to skip the "back links" (which is b ahere). to sum up i'd like to get the following result

A       B
------  ------
a       b
m       n

this sql query does not work since in the case b a is handled the a b is removed from my result set.

SELECT DISTINCT x1.A, x1.B
FROM TEST x1, TEST x2
WHERE x1.A = x2.B AND x1.B = x2.A -- all records /w counterparts only
AND x1.A NOT IN (SELECT B from TEST where B = x1.A) -- skip the "back links"

the 2nd part of the WHERE clause does not work as desired.

do you have any hints? any help with this would be greatly appreciated.

regards peter

p.s. i am using the derby db.

Upvotes: 6

Views: 8648

Answers (7)

Brahmnik
Brahmnik

Reputation: 1

Why do you even need tab2. This should work based on your desired result.

select 
  distinct tab1.ID, 
  case when tab1.A < tab1.B then tab1.A else tab1.B end as A,
  case when tab1.A > tab1.B then tab1.A else tab1.B end as B  
from TEST tab1;

Upvotes: 0

fthiella
fthiella

Reputation: 49049

SELECT Distinct
  case when tab1.A < tab1.B then tab1.A else tab1.B end as A,
  case when tab1.A > tab1.B then tab1.A else tab1.B end as B
FROM
  tab inner join tab tab1 on tab.B = tab1.A
WHERE
  tab1.B = tab.A

EDIT: based on your updated answer, i think you need this:

select distinct
  (case when tab1.A < tab1.B then tab1.A else tab1.B end) as A,
  (case when tab1.A > tab1.B then tab1.A else tab1.B end) as B
from TEST tab1 left join TEST tab2 on tab1.B = tab2.A

it gives the same result as your query, but it's much faster since there's no cartesian join.

Upvotes: 1

D. Manka
D. Manka

Reputation: 63

below you will find the sql code to create my table along with the inserted data

CREATE TABLE TEST (A varchar(4), B varchar(4));
INSERT INTO TEST (ID,A,B) VALUES ('1','d','a');
INSERT INTO TEST (ID,A,B) VALUES ('1','c','a');
INSERT INTO TEST (ID,A,B) VALUES ('1','b','a');
INSERT INTO TEST (ID,A,B) VALUES ('1','a','xxx');
INSERT INTO TEST (ID,A,B) VALUES ('1','a','d');
INSERT INTO TEST (ID,A,B) VALUES ('1','a','c');
INSERT INTO TEST (ID,A,B) VALUES ('1','a','b');
INSERT INTO TEST (ID,A,B) VALUES ('2','g','a');
INSERT INTO TEST (ID,A,B) VALUES ('2','a','g');
INSERT INTO TEST (ID,A,B) VALUES ('3','f','b');
INSERT INTO TEST (ID,A,B) VALUES ('3','b','f');
INSERT INTO TEST (ID,A,B) VALUES ('4','s','r');
INSERT INTO TEST (ID,A,B) VALUES ('4','r','s');
INSERT INTO TEST (ID,A,B) VALUES ('5','r','t');
INSERT INTO TEST (ID,A,B) VALUES ('7','h','g');

as described before, using this query:

select distinct tab1.ID,
   ( case when tab1.A < tab1.B then tab1.A else tab1.B end ) as A,
   ( case when tab1.A > tab1.B then tab1.A else tab1.B end) as B
from TEST tab1, TEST tab2

... i got the desired result:

ID  A  B
--  -- --
1   a  b
1   a  c
1   a  d
1   a  xxx
2   a  g
3   b  f
4   r  s
5   r  t
7   g  h

sorry guys, maybe i miss here something but it seems your solutions still does not work as intended.

@fthiella: i tested your solution:

SELECT tab1.*
FROM TEST tab1 LEFT JOIN TEST tab2 on tab1.B=tab2.A
WHERE tab1.A<tab1.B OR tab2.A is null

result (a/b duplicate, g/h missing):

ID  A  B
--  -- --
1   a  b
1   a  b
1   a  c
1   a  xxx
2   a  g
3   b  f
4   r  s
5   r  t

@wildplasser: it seems this solution does not work, too

SELECT * FROM TEST t
WHERE EXISTS (
    SELECT * FROM TEST x
    WHERE x.a = t.b AND x.b = t.a
    AND x.a > x.b -- tie breaker
    );

result (a/xxx and r/t are missing):

ID  A  B
--  -- --
1   a  b
1   a  c
1   a  d
2   a  g
3   b  f
4   r  s

Upvotes: 0

wildplasser
wildplasser

Reputation: 44250

SELECT *
FROM ztable t
WHERE EXISTS (
    SELECT * FROM ztable x
    WHERE x.a = t.b AND x.b = t.a
    AND x.a > x.b -- tie breaker
    );

The exists has the advantage that the (corellated) subquery is not visible to the outside query; so the select * will expand to only the columns of table t.

Upvotes: 0

D. Manka
D. Manka

Reputation: 63

thanks for all answers so far

i've got a slightly modified (easier) version of my first problem. i don't need the check on "counterpart" rows, i only need to skip the back references.

i modified fthiellas solution so far (see below) and it works. somehow, i think there must be a simplier solution.

select distinct
( case when tab1.A < tab1.B then tab1.A else tab1.B end ) as A,
( case when tab1.A > tab1.B then tab1.A else tab1.B end) as B
from TEST tab1, TEST tab2

since i am not restricted to derby, i wonder if there will be a problem with the CASE function when switching to another db system. a 'universal' solution would be nice which works with all the different sql dialects (mysql, postgres, oracle, mssql, etc.)

any ideas?

Upvotes: 0

JAQFrost
JAQFrost

Reputation: 1431

You could change your last line to:

AND x1.A < x1.B

This assumes that either your columns are never self-referential (eg: a, a) or that you don't want circular references to appear. If you do, then:

AND x1.A <= x1.B

EDIT:

You're probably better off using explicit joins as well:

SELECT DISTINCT 
    x1.A
    , x1.B
FROM 
    TEST x1
JOIN 
    TEST x2
    ON x1.A = x2.B 
        AND x1.B = x2.A -- all records /w counterparts only
WHERE x1.A < x1.B --Skip backreference

Upvotes: 6

il_guru
il_guru

Reputation: 8508

I do not have a derby db to test on but i think this should work.
As you did not specify how we can decide which one between a/b and b/a is the back link i used the first occurence found to be the correct direction.

The idea behind the query is joining together the same table to get the back link and a row_number representing the position of the found item. Then joining together the tables with the position and taking the one with the first occurence found.

select TOT1.A, TOT2.B
(select distinct t1.A, t1.B, row_number() over() as num
from test t1
join test t2
on t1.A = t2.B and t1.B = t2.A) as TOT1
join
(select distinct t1.A, t1.B, row_number() over() as num
from test t1
join test t2
on t1.A = t2.B and t1.B = t2.A) as TOT2
on TOT1.A = TOT2.B and TOT1.B = TOT2.A and TOT1.NUM < TOT2.NUM

Upvotes: 0

Related Questions