Julien S
Julien S

Reputation: 91

SELECT only one pair if I have both (A, B) and (B, A)

Here is a simple question, when I have (A,B) as a result of my query, I would like to have only (A,B), not (B,A).

For exemple my query returns :

161, 52 
161, 53
53, 161
53, 161

Here is my query :

SELECT S1.SURVEY_ID, S2.SURVEY_ID

  FROM SURVEYS S1, SURVEYS S2

  WHERE (S2.START_DATE BETWEEN S1.START_DATE and S1.END_DATE
  OR S2.END_DATE BETWEEN S1.START_DATE and S1.END_DATE)
  AND S1.SURVEY_ID != S2.SURVEY_ID

ORDER BY S1.SURVEY_ID, S2.SURVEY_ID

Upvotes: 1

Views: 1285

Answers (3)

Stefan Steinegger
Stefan Steinegger

Reputation: 64638

Not sure if I really understood the question. Guess: you want that (A,B) is treated as a duplicate of (B,A) and one of them removed.

Bring the values into an order (higher before lower), then you can use distinct:

SELECT distinct 
  (CASE WHEN S1.SURVEY_ID > S2.SURVEY_ID THEN S1.SURVEY_ID ELSE S2.SURVEY_ID END) as "higher id",
  (CASE WHEN S1.SURVEY_ID > S2.SURVEY_ID THEN S2.SURVEY_ID ELSE S1.SURVEY_ID END) as "lower id"
...

Upvotes: 2

Gordon Linoff
Gordon Linoff

Reputation: 1270713

Here is one method:

WITH SS as (
      SELECT S1.SURVEY_ID as SURVEY_ID1, S2.SURVEY_ID as SURVEY_ID2
      FROM SURVEYS S1 JOIN
           SURVEYS S2
           ON (S2.START_DATE BETWEEN S1.START_DATE and S1.END_DATE OR
               S2.END_DATE BETWEEN S1.START_DATE and S1.END_DATE
              ) AND
              S1.SURVEY_ID <> S2.SURVEY_ID
     )
SELECT ss.*
FROM ss
WHERE SURVEY_ID1 < SURVEY_ID2
UNION ALL
SELECT ss.*
FROM ss
WHERE SURVEY_ID1 > SURVEY_ID2 AND
      NOT EXISTS (SELECT 1 FROM ss ss2 WHERE ss2.SURVEY_ID1 = ss.SURVEY_ID2 AND ss2.SURVEY_ID2 = ss.SURVEY_ID1);

However, if you just want overlapping surveys, then this would be the appropriate query:

      SELECT S1.SURVEY_ID as SURVEY_ID1, S2.SURVEY_ID as SURVEY_ID2
      FROM SURVEYS S1 JOIN
           SURVEYS S2
           ON S2.START_DATE <= S1.END_DATE AND
              S2.END_DATE >= S1.START_DATE 
              S1.SURVEY_ID < S2.SURVEY_ID;

Upvotes: 2

sagi
sagi

Reputation: 40491

You can use conditional ordering that will group each set (A,B) , (B,A) as one group .

SELECT MAX(S1.SURVEY_ID) as survey_id_1, MIN(S2.SURVEY_ID) as survey_id_2
FROM SURVEYS S1
INNER JOIN SURVEYS S2
 ON (S2.START_DATE BETWEEN S1.START_DATE and S1.END_DATE
     OR S2.END_DATE BETWEEN S1.START_DATE and S1.END_DATE)
WHERE S1.SURVEY_ID != S2.SURVEY_ID
GROUP BY CASE WHEN S1.SURVEY_ID > S2.SURVEY_ID
              THEN S1.SURVEY_ID
              ELSE S2.SURVEY_ID
         END
ORDER BY survey_id_1, survey_id_2

This will result in the (Bigger survey,Smaller survey) - 1 per group .

I've also changed your join syntax to the proper syntax of a join, please try to avoid the use of implicit join syntax(comma separated) .

Upvotes: 3

Related Questions