Tendero
Tendero

Reputation: 1166

Join table with itself using duplicates

Assume there is a table like this:

id  camp score flag
A2  C3   2     I
A2  C1   1     I
A1  C1   2     I
A2  C2   3     I
A1  C2   1     I
A2  C3   0     O
A1  C1   0     O

I want to detect all the duplicates taking just the columns id and camp. In case these pairs are duplicated, I want to keep just one row that has the score value when the flag is I, and then change the flag to O. The output should be:

id  camp score flag
A2  C1   1     I
A2  C2   3     I
A1  C2   1     I
A2  C3   2     O
A1  C1   2     O

I tried to achieve this using this query:

SELECT
    i.id,
    i.camp,
    i.score,
    COALESCE(o.flag, 'I') AS flag
FROM
    t1 AS i

    LEFT JOIN t1 AS o
    ON i.id= o.id AND i.camp = o.camp

WHERE
    i.flag= 'I'
    AND o.flag= 'O'

I thought this would work, but this returns just two rows:

id  camp score flag
A2  C3   2     O
A1  C1   2     O

Why is this happening? How can I get my desired output without creating a new table (if possible)?

Upvotes: 0

Views: 58

Answers (4)

jarlh
jarlh

Reputation: 44796

Move the o.flag= 'O' condition from the WHERE clause to the ON clause, to get true LEFT JOIN result. (When in WHERE, the LEFT JOIN returns regular INNER JOIN result.)

Your query modified:

SELECT
    i.id,
    i.camp,
    i.score,
    COALESCE(o.flag, 'I') AS flag
FROM
    t1 AS i

    LEFT JOIN t1 AS o
    ON i.id= o.id AND i.camp = o.camp
        AND o.flag= 'O'
WHERE
    i.flag= 'I'

Upvotes: 3

Eray Balkanli
Eray Balkanli

Reputation: 7990

Please find an alternative answer below where temporary tables are employed to get the desired resultset. You can use "NOT EXISTS" keyword here, or to go on your own script, to get the rows you are asking for.

CREATE TABLE #temp (id  VARCHAR(2),camp VARCHAR(2), score INT, flag VARCHAR(1))
INSERT INTO #temp VALUES 
    ('A2',  'C3',   2,     'I'),
    ('A2',  'C1',   1  ,   'I'),
    ('A1',  'C1',   2    , 'I'),
    ('A2',  'C2',   3   ,  'I'),
    ('A1',  'C2',   1   ,  'I'),
    ('A2',  'C3',   0   ,  'O'),
    ('A1',  'C1',   0  ,   'O')

SELECT m1.id,m1.camp,m1.score, flag='O'
INTO #duplicatesWithScore
FROM #temp m1
INNER JOIN (
    SELECT id,camp
    FROM #temp
    GROUP BY id,camp
    HAVING COUNT(*)>1
) m2 ON m1.id=m2.id AND m1.camp=m2.camp
WHERE Score>0

SELECT * 
FROM #temp t
WHERE  NOT EXISTS
(SELECT 1 FROM #duplicatesWithScore d WHERE d.id=t.id AND d.camp=t.camp)

UNION
SELECT * FROM #duplicatesWithScore

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270553

You can express this logic in an aggregation query:

select id, camp,
       (case when count(*) = 1 then max(flag) else 'O' end) as flag,
       (case when count(*) = 1 then max(score)
             else max(case when flag = 'I' then score end)
        end) as score
from t
group by id, camp;

Upvotes: 0

Janine Rawnsley
Janine Rawnsley

Reputation: 1258

Bit of a cheat but would this work?

SELECT id, camp, MAX(score), MAX(flag)
FROM t1
GROUP BY id, camp

Upvotes: 1

Related Questions