Kevin
Kevin

Reputation: 6302

Find one single row for a column with a unique value using SQL

I have a table which contains data that similar to this:

RowID      |    CustomerID   |    Quantity      |    Type        | .....
  1        |        345      |       100        |    Software    | .....
  2        |       1280      |       200        |    Software    | .....
  3        |        456      |        20        |      Hub       | .....
  4        |        345      |       100        |    Software    | .....
  5        |        345      |       180        |    Monitor     | .....
  6        |        23       |        15        |    Router      | .....
  7        |       1280      |       120        |    Software    | .....
  8        |        345      |         5        |       Mac      | .....
....       |        ....     |       ...        |     .....      | .....

The database have hundreds of thousand of rows. As you can see, the CustomerID has duplicates.

What I want to do is to find EXACTLY ONE row for each unique CustomerID and Type combination and with Quantity more than 10.

For example, for the above table, I want to get:

RowID      |    CustomerID   |    Quantity      |    Type        | .....
  2        |       1280      |       200        |    Software    | .....
  3        |        456      |        20        |      Hub       | .....
  4        |        345      |       100        |    Software    | .....
  5        |        345      |       180        |    Monitor     | .....
  6        |        23       |        15        |    Router      | .....

What I tried to do is:

select distinct CustomerID, Type from MyTable
where Quantity > 10

Which gives me:

 CustomerID   |     Type   
    1280      |    Software
     456      |      Hub   
     345      |    Software
     345      |     Monitor
     23       |     Router 

But I don't know how to select other columns because if I do:

select distinct CustomerID, Type, RowID, Quantity from MyTable
where Quantity > 10

It returns every rows because the RowID is unique.

I think maybe I should use a subquery by iterating the result of the above query. Can someone help me on this?

Upvotes: 2

Views: 82

Answers (4)

jpw
jpw

Reputation: 44921

One way is to use the row_number window function as partition the data by CustomerID and Type, and the filter out the first rows in each partition.

WITH Uniq AS (
  SELECT 
    CustomerID, Type, RowID, Quantity, 
    rn = ROW_NUMBER() OVER (PARTITION BY CustomerID, Type ORDER BY RowID) 
  FROM MyTable WHERE Quantity > 10
)
SELECT * FROM Uniq WHERE rn = 1;

SQL Fiddle

Or you could find the a unique RowID (min or max) for each group of CustomerID and Type and use that as a source in a join, either as a common table expression of derived table:

WITH Uniq AS (
 SELECT MIN(RowID) RowID FROM MyTable WHERE Quantity > 10 GROUP BY CustomerID, Type
)
SELECT MyTable.* FROM MyTable JOIN Uniq ON MyTable.RowID = Uniq.RowID

Sample SQL Fiddle

Upvotes: 0

RandomGuy
RandomGuy

Reputation: 667

You need to choose which one of the "duplicated" rows to retrieve.

I wrote duplicated with quotes because they are not technically duplicated:

+-------+------------+----------+----------+
| RowID | CustomerID |   Type   | Quantity |
+-------+------------+----------+----------+
|     1 |        345 | Software |      100 |
|     2 |        345 | Software |      200 |
|     3 |        345 | Software |      300 |
+-------+------------+----------+----------+

All of this are different rows because of the different RowID and Quantity columns.

So you must to specify which one of these you want to retrieve.

For this example I will use the RowID and Quantity with the minimum value. So I will tell SQL to pick this one, for this I will order the table by RowID and Quantity in ascending order and I will do a join with the same table so I can pick up the first row with the lower RowID and Quantity for the same CustomerID and Type.

+-------+------------+----------+----------+
| RowID | CustomerID |   Type   | Quantity |
+-------+------------+----------+----------+
|     1 |        345 | Software |      100 |
+-------+------------+----------+----------+

The SQL code for this is the following:

SELECT 
      * 
FROM 
      MyTable originalTable
WHERE 
      originalTable.Quantity > 10 AND
      originalTable.RowID =
(
   SELECT TOP 1 orderedTable.RowID 
   FROM MyTable orderedTable 
   WHERE orderedTable.CustomerID = originalTable.CustomerID AND orderedTable.Type = originalTable.Type
   ORDER BY orderedTable.RowID ASC, orderedTable.Quantity ASC
)

Upvotes: 0

Aeroradish
Aeroradish

Reputation: 371

Use Partition Over. This will allow you to group all similar rows together, and then you query that table to get just the first row. Note: An "order by" must be specified in the partition, even if you don't use the value. But it is useful for pulling the combination with the highest quantity. If you also want distinct Quantity, add that column to the select in the partition.

select CustomerId
, Type  
FROM
 (
    select
    CustomerId
    , Type
    , row_number() over (partition by CustomerId, Type order by Quantity desc) as rn
From MyTable
where Quantity > 10
) dta
Where rn = 1

Upvotes: 1

Andrey Belykh
Andrey Belykh

Reputation: 2654

Something like this will work (unless you have more requirements that you didn't mention):

SELECT CustomerID, Type, SUM(Quantity) AS Quantity
FROM MyTable
GROUP BY CustomerID, Type
HAVING SUM(Quantity) > 10

Upvotes: 0

Related Questions