Reputation: 89
I have looked at many examples but i cannot figure this out. Might be too simple for my tired brain. I have a table, I need to select all columns, with one distinct.
ID CAT PRODUCT
=======================
1 21 Product1
2 21 Product2
3 23 Product2
4 24 Product3
5 24 Product5
6 25 Product4
7 25 Product6
I need to return
ID CAT PRODUCT
=======================
1 21 Product1
3 23 Product2
4 24 Product3
7 25 Product6
ID and PRODUCT may or may not be the same. No condition needed. Category (CAT) needs to be distinct. THE Values returned need to be from same row.
In Mysql I can do the following, but MSSQL does not like it.
SELECT * FROM table GROUP BY CAT
Upvotes: 1
Views: 2654
Reputation: 2171
If ID is unique or primary key, then alternative solution would be:
SELECT tn.ID, tn.Cat, tn.Product -- or just tn.* if all columns are to be selected
FROM
(
SELECT ID = MIN(ID)
FROM TableName
GROUP BY CAT
) x
INNER JOIN TableName tn ON
x.ID = tn.ID
Nested 'x' query selects minimum ID for each unique category CAT and inner join returns rows from TableName with required ID's.
Original answer.
Now it should be read as 'What you should not do if you want the column values to be from the same original row'. The example at the bottom was changed to show how data gets mixed.
When Group By is used TSQL requires that all columns used should be either listed in GROUP BY clause or put inside aggregate functions (MIN, MAX, AVG, COUNT, etc.).
So, we need to choose which columns have to be distinct, and which columns should be put inside aggregate functions and what aggregate functions will be suitable for our purpose.
You need the CAT
column to be distinct - you should GROUP BY it.
ID and PRODUCT columns are not constrained by any condition - so, we may aggregate them. From all aggregate functions only MIN and MAX suit our purpose, because they do not change the type of the column they are applied to.
From your example the ID column always gets minimum value, so best option is to use MIN.
As for PRODUCT - you can use both MIN or MAX - I can see no difference in your current example.
SELECT MIN(Id), Cat, MAX(Product)
FROM TableName
GROUP BY Cat
How it works:
ID CAT PRODUCT ID CAT PRODUCT
================= ================
1 21 Product1 1 21 -- 1 is minimums for CAT 21
2 21 Product2 Product2 -- Product2 is maximum for CAT 21
----------------- ----------------
3 23 Product2 3 23 Product2 -- Only one row for CAT 23
----------------- ----------------
4 24 Product3 4 24 -- 4 is minimum for CAT 24
5 24 Product5 Product5 -- Product5 is maximum for CAT 24
----------------- ----------------
6 25 Product4 6 25 -- 6 is minimum for CAT 25
7 25 Product6 Product6 -- Product6 is maximum for CAT 25
So, the result would be:
ID CAT PRODUCT
=================
1 21 Product2
3 23 Product2
4 24 Product5
6 25 Product6
As you can the CAT's are distinct, but column values are mixed between rows.
Upvotes: 2
Reputation: 70523
To get the first row as ordered by id, distinct on cat you would use this
select id, cat, product from
(
Select *,
ROW_NUMBER() OVER (PARTITION BY cat ORDER BY ID) as num
from table
) t
where num = 1
This creates a number of each element in the groups which would be created by a group by (as below) and then select the first one.
This would do it, I can't imagine why you would want this result.
select max(id) , cat, max(product)
from table
group by cat
or
select min(id) , cat, min(product)
from table
group by cat
would work.
You need to use an aggregate function, this function picks which item to use in the set of values found by the group by.
So for example when you group by cat you have two rows that need to be "combined".
1 21 Product1
2 21 Product2
for the id column and the product column you have to pick one of the two values to use if you use the aggregate function max() you will pick the max of those values. In the same way min() will give you the min of those values.
Upvotes: 4