Reputation: 6302
I have a table which contains data that similar to this:
RowID | CustomerID | Quantity | Type | .....
1 | 345 | 100 | Software | .....
2 | 1280 | 200 | Software | .....
3 | 456 | 20 | Hub | .....
4 | 345 | 100 | Software | .....
5 | 345 | 180 | Monitor | .....
6 | 23 | 15 | Router | .....
7 | 1280 | 120 | Software | .....
8 | 345 | 5 | Mac | .....
.... | .... | ... | ..... | .....
The database have hundreds of thousand of rows. As you can see, the CustomerID has duplicates.
What I want to do is to find EXACTLY ONE
row for each unique CustomerID and Type combination and with Quantity more than 10.
For example, for the above table, I want to get:
RowID | CustomerID | Quantity | Type | .....
2 | 1280 | 200 | Software | .....
3 | 456 | 20 | Hub | .....
4 | 345 | 100 | Software | .....
5 | 345 | 180 | Monitor | .....
6 | 23 | 15 | Router | .....
What I tried to do is:
select distinct CustomerID, Type from MyTable
where Quantity > 10
Which gives me:
CustomerID | Type
1280 | Software
456 | Hub
345 | Software
345 | Monitor
23 | Router
But I don't know how to select other columns because if I do:
select distinct CustomerID, Type, RowID, Quantity from MyTable
where Quantity > 10
It returns every rows because the RowID is unique.
I think maybe I should use a subquery by iterating the result of the above query. Can someone help me on this?
Upvotes: 2
Views: 82
Reputation: 44921
One way is to use the row_number window function as partition the data by CustomerID
and Type
, and the filter out the first rows in each partition.
WITH Uniq AS (
SELECT
CustomerID, Type, RowID, Quantity,
rn = ROW_NUMBER() OVER (PARTITION BY CustomerID, Type ORDER BY RowID)
FROM MyTable WHERE Quantity > 10
)
SELECT * FROM Uniq WHERE rn = 1;
Or you could find the a unique RowID (min or max) for each group of CustomerID and Type and use that as a source in a join, either as a common table expression of derived table:
WITH Uniq AS (
SELECT MIN(RowID) RowID FROM MyTable WHERE Quantity > 10 GROUP BY CustomerID, Type
)
SELECT MyTable.* FROM MyTable JOIN Uniq ON MyTable.RowID = Uniq.RowID
Upvotes: 0
Reputation: 667
You need to choose which one of the "duplicated" rows to retrieve.
I wrote duplicated with quotes because they are not technically duplicated:
+-------+------------+----------+----------+
| RowID | CustomerID | Type | Quantity |
+-------+------------+----------+----------+
| 1 | 345 | Software | 100 |
| 2 | 345 | Software | 200 |
| 3 | 345 | Software | 300 |
+-------+------------+----------+----------+
All of this are different rows because of the different RowID and Quantity columns.
So you must to specify which one of these you want to retrieve.
For this example I will use the RowID and Quantity with the minimum value. So I will tell SQL to pick this one, for this I will order the table by RowID and Quantity in ascending order and I will do a join with the same table so I can pick up the first row with the lower RowID and Quantity for the same CustomerID and Type.
+-------+------------+----------+----------+
| RowID | CustomerID | Type | Quantity |
+-------+------------+----------+----------+
| 1 | 345 | Software | 100 |
+-------+------------+----------+----------+
The SQL code for this is the following:
SELECT
*
FROM
MyTable originalTable
WHERE
originalTable.Quantity > 10 AND
originalTable.RowID =
(
SELECT TOP 1 orderedTable.RowID
FROM MyTable orderedTable
WHERE orderedTable.CustomerID = originalTable.CustomerID AND orderedTable.Type = originalTable.Type
ORDER BY orderedTable.RowID ASC, orderedTable.Quantity ASC
)
Upvotes: 0
Reputation: 371
Use Partition Over. This will allow you to group all similar rows together, and then you query that table to get just the first row. Note: An "order by" must be specified in the partition, even if you don't use the value. But it is useful for pulling the combination with the highest quantity. If you also want distinct Quantity, add that column to the select in the partition.
select CustomerId
, Type
FROM
(
select
CustomerId
, Type
, row_number() over (partition by CustomerId, Type order by Quantity desc) as rn
From MyTable
where Quantity > 10
) dta
Where rn = 1
Upvotes: 1
Reputation: 2654
Something like this will work (unless you have more requirements that you didn't mention):
SELECT CustomerID, Type, SUM(Quantity) AS Quantity
FROM MyTable
GROUP BY CustomerID, Type
HAVING SUM(Quantity) > 10
Upvotes: 0