AJ_
AJ_

Reputation: 3977

SQL - Finding multiple duplicates

I have a table of products. These products have a product id. The product id can be shared by products. They are differentiated by the version id. For example bike has an id of 1. There are different kinds of bikes that share p_id of 1 but have v_id of 1-15. In the table there are many duplicates and i would like to find them. I made a query that finds the duplicates of the products. So i can see the count of versions of the product, but some of the products were inserted twice. I wrote a query to view the count of versions of the products, but now i would like to see all the duplicates. There is a column called product Value (the description) which is the indicator if its a duplicate.

Example

productKey  productValue    cout
16293   Bike    2
16292   Bike    2
16291   Bike    2
16290   Pads    2
16289   Pads    2
16288   Helmet  3
16286   Shoe    2

From here you can see bike and pads show up 3 times, but Helmet & Shoe only show up once. I would like to edit my duplicate query (or add to it), so that i can just get the items that show up more than once (aka Helmet and shoe would not appear ) The third column is the version, but that can be ignored.

Query

SELECT productKey, productValue, COUNT(*) as cout
FROM [Store].[dbo].[products]
GROUP BY productKey, productValue
HAVING COUNT(*) > 1
ORDER BY productKey DESC

Wanted result, something like this

productKey  productValue    cout
16293   Bike    2
16292   Bike    2
16291   Bike    2
16290   Pads    2
16289   Pads    2

Upvotes: 0

Views: 525

Answers (6)

Yogesh Sharma
Yogesh Sharma

Reputation: 50163

Please try below SQL Query which could help you to achieve the result :

;WITH CTE
     AS (
     SELECT productKey,
            productValue,
            COUNT(*) OVER(PARTITION BY productValue) AS ActualCnt
     FROM products
     GROUP BY productKey,
              productValue)
     SELECT T.productKey,
            T.productValue,
            T.cout
     FROM CTE C
          INNER JOIN products T ON T.productValue = C.productValue
     WHERE C.ActualCnt > 1
     GROUP BY T.productKey,
              T.productValue,
              T.cout;

Desired Output :

productKey  productValue   cout
----------- ------------- -------
16289       Pads           2
16290       Pads           2
16291       Bike           2
16292       Bike           2
16293       Bike           2

Upvotes: 0

Thorsten Kettner
Thorsten Kettner

Reputation: 94859

What you really want (from what I understand now): Duplicate product names, but ignoring duplicates within a product number, because these are versions of the same product and hence of course not considered a problem.

So you are looking for product names where you count more than one product number. Use COUNT(DISTINCT ProductKey) for this.

select * 
from
(
  select products.*, count(distinct productkey) over (partition by productValue) as cnt
  from products
) counted
where cnt > 1;

Upvotes: 0

Ilyes
Ilyes

Reputation: 14928

Here you go:

DECLARE @T TABLE (productKey INT, productValue VARCHAR(30), cout INT);
INSERT INTO @T VALUES
(16293,   'Bike',  2),
(16292,   'Bike',  2),
(16291,   'Bike',  2),
(16290,   'Pads',  2),
(16289,   'Pads',  2),
(16288,   'Helmet',  3),
(16286,   'Shoe',  2);

SELECT *
FROM @T
WHERE  productValue IN (SELECT productValue FROM (SELECT productValue,COUNT(productValue) N FROM @T GROUP BY productValue) AS T WHERE T.N > 1) ;

Result:

+------------+--------------+------+
| productKey | productValue | cout |
+------------+--------------+------+
|      16293 | Bike         |    2 |
|      16292 | Bike         |    2 |
|      16291 | Bike         |    2 |
|      16290 | Pads         |    2 |
|      16289 | Pads         |    2 |
+------------+--------------+------+

Upvotes: 0

Pரதீப்
Pரதீப்

Reputation: 93694

Use Count() Over() window aggregate function

Select * from 
(
select *,
       cout = Count(1)over(partition by productValue)
from [Store].[dbo].[products]
) a
Wher cnt > 1

If you want to use Group By then you need sub-query

Select * 
from  [Store].[dbo].[products] 
where productValue in (SELECT productValue 
                       FROM [Store].[dbo].[products] 
                       GROUP BY productValue HAVING COUNT(*) > 1) 

Upvotes: 1

Thorsten Kettner
Thorsten Kettner

Reputation: 94859

You want to find products where two versions refer to the same thing. (At least this is what your query does.) And from these products that have duplicate versions you want to select only those products where the product title occurs in at least two problem products. That doesn't seem to make much sense, but here you go:

SELECT productKey, productValue, cout
FROM
(
  SELECT 
    productKey, 
    productValue, 
    COUNT(*) as cout,
    COUNT(*) OVER (PARTITION BY productValue) as cnt
  FROM Store.dbo.products
  GROUP BY productKey, productValue
  HAVING COUNT(*) > 1
) counted
WHERE cnt > 1
ORDER BY productKey DESC;

Upvotes: 1

Abdulkadir Erkmen
Abdulkadir Erkmen

Reputation: 202

You can select duplicate rows in subquery and select them in your main query



  SELECT 
         productKey, productValue,cout
    FROM 
         [Store].[dbo].[products] p1
    WHERE 
         productKey IN( 
               SELECT productKey
               FROM [Store].[dbo].[products]
               GROUP BY productKey, productValue
               HAVING COUNT(*) > 1
               )

Upvotes: 0

Related Questions