Reputation: 31

Select distinct combination of rows based on columns

I have a table where the columns author1 and author2 have mirror data like (a,b) and (b,a). But I don't want the data to be redundant so a SQL query is needed for eliminating records that have (b,a) if (a,b) already exists.

author1              author2                No of Publications
Anna Spagnolli       Luciano Gamberini           115
Luciano Gamberini    Anna Spagnolli              115
Anna Spagnolli       Giuseppe Riva                66
Giuseppe Riva        Anna Spagnolli               66
Giuseppe Riva        Luciano Gamberini            60
Luciano Gamberini    Giuseppe Riva                60
Achim Jung           Anna Spagnolli               33
Anna Spagnolli       Achim Jung                   33
Anna Spagnolli       John A. Waterworth           33
John A. Waterworth   Anna Spagnolli               33

For eg, I don't want the second record here because it already exists in the first record

Upvotes: 2

Answers (4)

sumo

Reputation: 26

CREATE TABLE #TEMI1(AUTHOR1 CHAR(3),AUTHOR2 CHAR(3),NO_OF_PUB INT)

INSERT INTO #TEMI1 VALUES
('A','B',115),('B','A',115),('A','C',66),('C','A',66),('D','A',78),('B','C',115)

WITH CTE1 AS(
SELECT *,
ROW_NUMBER() OVER(PARTITION BY NO_OF_PUB ORDER BY NO_OF_PUB) RNUM FROM #TEMI1 )

--SELECT * FROM CTE1 C1
--JOIN CTE1 C2
--ON C1.NO_OF_PUB=C2.NO_OF_PUB AND C1.AUTHOR1=C2.AUTHOR2 AND C1.AUTHOR2=C2.AUTHOR1


DELETE C1 FROM CTE1 C1 
JOIN CTE1 C2
ON C1.NO_OF_PUB=C2.NO_OF_PUB AND C1.AUTHOR1=C2.AUTHOR2 AND C1.AUTHOR2=C2.AUTHOR1
WHERE C1.RNUM>1

Upvotes: 0

Nishant Gupta

Reputation: 3656

Solution for your problem :

SELECT table1.author1, table1.author2, table1.No_Of_Publications FROM table1
LEFT JOIN table1 as tbl
ON table1.author1 = tbl.author2
AND table1.author2 = tbl.author1
WHERE table1.author1 < tbl.author1
OR tbl.author1 IS NULL

For demo:

http://sqlfiddle.com/#!9/2263fa/14

Upvotes: 2

uhs

Reputation: 848

Try this https://www.db-fiddle.com/f/buQqnt1YsfoWCuJoMdgVrz/0

CREATE TABLE table1 (
  author1 VARCHAR(100),
  author2 VARCHAR(100),
  publication INT
);

insert into table1 values ('a', 'b', 100);

insert into table1 values ('b', 'a', 100);

select   (case
  when t.author1>t.author2 then t.author1 else t.author2  END ) as a1 ,
   (case
  when t.author1>t.author2 then t.author2 else t.author1  END ) as a2 , publication

 from table1 as t group by a1, a2, publication;

Upvotes: 0

Gaurav

Reputation: 1109

Please try to use below query :-

select (case when author1> author2 then author2 else author1 end) as author1
      ,(case when author1> author2 then author1 else author2 end) as author2
      ,publications 
from calc
group by (case when author1> author2 then author2 else author1 end)
,(case when author1> author2 then author1 else author2 end)
,publications

Upvotes: 1

Select distinct combination of rows based on columns

Answers (4)

Related Questions