Dillinger
Dillinger

Reputation: 341

SELECT all columns from distinct two columns

I need to SELECT * where distinct 'Rua' with different 'CP' in the following DB:

id      Rua                  Local           Conc       CP
81143   dos moinhos          Rio Tinto       Gondomar   0123
81142   dos Moinhos          Rio Tinto       Gondomar   4435
81141   dos Moinhos          Rio Tinto       Gondomar   4435

With the following query, i am able to obtain the two columns:

SELECT Rua, CP 
FROM  Codigo
GROUP BY Rua, CP
HAVING COUNT(*) = 1

But i want all the columns. SELECT * returns "Column 'Codigo.id' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause."

So i came upon this answer: How do I (or can I) SELECT DISTINCT on multiple columns?

I didn't go for the accepted answer, as i need a fast query (this will be used for AJAX search suggestions). I used the other highly upvoted answer. So, i created the following query:

SELECT * From Codigo
WHERE  (Rua,CP) IN (
   SELECT Rua, CP
   FROM   Codigo
   GROUP  BY Rua, CP
   HAVING COUNT(*) = 1 
);

That returns An expression of non-boolean type specified in a context where a condition is expected, near ',' error. Why it expects Rua after WHERE to be boolean? In the answer they used saleprice which don't seem boolean.

So my question is how can i select all columns including the rows with same Rua and CP, but those only once (DISTINCT)?

Upvotes: 3

Views: 11744

Answers (2)

Pரதீப்
Pரதீப்

Reputation: 93694

SQL Server does not support (a,b) in (val1,val2) syntax

Use EXISTS to check the pair

SELECT *
FROM   Codigo C1
WHERE  EXISTS (SELECT 1
               FROM   Codigo C2
               WHERE  C1.Rua = C2.Rua
                      AND C1.CP = C2.CP
               HAVING Count(*) = 1); 

or use COUNT() Over() Window aggregate function to count the records for each Rua, CP combination and filter the group which is having count as 1 to find one Rua row per each different CP.

SELECT *
FROM   (SELECT *,
               Count(1)OVER(partition BY Rua, CP) AS cnt
        FROM   Codigo C1) A
WHERE  cnt = 1 

This will be my preferred approach since it is much efficient than using EXISTS the optimizer has to scan/seek the table only once

If you don't want duplicate records then use ROW_NUMBER()

SELECT *
FROM   (SELECT *,
               Row_Number()OVER(partition BY Rua, CP Order by id) AS RN
        FROM   Codigo C1) A
WHERE  RN = 1 

I have ordered the duplicate records using Id column and picked the first record out of duplicates.

Upvotes: 3

Gurwinder Singh
Gurwinder Singh

Reputation: 39457

Using Window Function (Single table read):

SELECT t.*
FROM
  (SELECT t.*, COUNT(1) OVER (partition BY Rua, CP) AS cnt FROM Codigo t
  ) t
WHERE cnt = 1;

Using EXISTS (Reads the table twice):

SELECT *
FROM Codigo t1
WHERE EXISTS
  (SELECT 1
  FROM Codigo t2
  WHERE t1.Rua    = t2.Rua
  AND t1.CP       = t2.CP
  GROUP BY Rua, CP 
  HAVING COUNT(*) = 1
  );

JOIN (Twice again):

SELECT C1.*
FROM Codigo C1
INNER JOIN
  (SELECT Rua, CP FROM Codigo C2 GROUP BY Rua, CP HAVING COUNT(*) = 1
  ) C2
ON C1.Rua = C2.Rua
AND C1.CP = C2.CP; 

Upvotes: 2

Related Questions