Reputation: 141
I have a table like this:
CREATE TABLE Table1
([IdeaNr] int, [SubmitterName] varchar(4), [SubmitterDepartment] varchar(4))
;
INSERT INTO Table1
([IdeaNr], [SubmitterName], [SubmitterDepartment])
VALUES
(1, 'Joe', 'Org1'),
(1, 'Bill', 'Org2'),
(1, 'Kate', 'Org1'),
(1, 'Tom', 'Org3'),
(2, 'Sue', 'Org2'),
(3, 'Bill', 'Org2'),
(3, 'Fred', 'Org1'),
(4, 'Ted', 'Org3'),
(4, 'Kate', 'Org1'),
(4, 'Hank', 'Org3')
;
I want get the following result from a query:
IdeaNr SubmitterCount SubmitterRndName SubmitterRndDepartment
1 4 Joe or ... Org1 (if Joe)
2 1 Sue Org2
3 2 Bill or ... Org2 (if Bill)
4 3 Ted or ... Org3 (if Ted)
I have tried a lot of things with all kind of JOINs of Table1 with itself, derived tables and GROUP BY, e.g.:
SELECT COUNT(IdeaNr) AS SubmitterCount,IdeaNr,SubmitterName,SubmitterDepartment
FROM Table1
GROUP BY IdeaNr,SubmitterName,SubmitterDepartment
I think the problem is to create an algorithm that takes just the first (or a random) name and department appearing in a group of IdeaNr. It is absolutely clear that you can get to misleading interpretations of that kind of data, e. g.:
But this kind of "wrong averaging" is OK for the task. Can you help?
EDIT: The expected result evolved during the discussion. The desired result changed to:
IdeaNr SubmitterCount SubmitterRndName SubmitterRndDepartment
1 4 Joe, Bill, ... GroupIdea
2 1 Sue Org2
3 2 Bill, Fred GroupIdea
4 3 Ted, ... GroupIdea
Upvotes: 4
Views: 80
Reputation: 67311
Try it like this:
DECLARE @Table1 TABLE ([IdeaNr] int, [SubmitterName] varchar(4), [SubmitterDepartment] varchar(4));
INSERT INTO @Table1([IdeaNr], [SubmitterName], [SubmitterDepartment])
VALUES
(1, 'Joe', 'Org1'),
(1, 'Bill', 'Org2'),
(1, 'Kate', 'Org1'),
(1, 'Tom', 'Org3'),
(2, 'Sue', 'Org2'),
(3, 'Bill', 'Org2'),
(3, 'Fred', 'Org1'),
(4, 'Ted', 'Org3'),
(4, 'Kate', 'Org1'),
(4, 'Hank', 'Org3');
SELECT x.IdeaNr
,Count(x.IdeaNr)
,MAX(Submitter.SubmitterName) AS SubmitterRndName
,MAX(Submitter.SubmitterDepartment) AS SubmitterRndDepartment
FROM @Table1 AS x
CROSS APPLY
(
SELECT TOP 1 SubmitterName, SubmitterDepartment
FROM @Table1 AS y
WHERE y.IdeaNr=x.IdeaNr
) AS Submitter
GROUP BY x.IdeaNr
There is one more idea, don't know if you could need this:
SELECT x.IdeaNr
,Count(x.IdeaNr)
,STUFF(
(
SELECT ', ' + y.SubmitterName --maybe with DISTINCT
FROM @Table1 AS y
WHERE y.IdeaNr=x.IdeaNr
FOR XML PATH('')
),1,2,'') AS AllSubmitters
,STUFF(
(
SELECT ', ' + z.SubmitterDepartment --maybe with DISTINCT
FROM @Table1 AS z
WHERE z.IdeaNr=x.IdeaNr
FOR XML PATH('')
),1,2,'') AS AllDepartments
FROM @Table1 AS x
GROUP BY x.IdeaNr
This comes back with
IdeaNr AllSubmitters AllDepartments
1 4 Joe, Bill, Kate, Tom Org1, Org2, Org1, Org3
2 1 Sue Org2
3 2 Bill, Fred Org2, Org1
4 3 Ted, Kate, Hank Org3, Org1, Org3
EDIT: Following your idea from the last comment:
SELECT x.IdeaNr
,COUNT(x.IdeaNr)
,STUFF(
(
SELECT DISTINCT ', ' + y.SubmitterName
FROM @Table1 AS y
WHERE y.IdeaNr=x.IdeaNr
FOR XML PATH('')
),1,2,'') AS AllSubmitters
,CASE WHEN COUNT(x.IdeaNr)=1 THEN (SELECT TOP 1 z.SubmitterDepartment FROM @Table1 AS z WHERE z.IdeaNr=x.IdeaNr)
ELSE 'GroupIdea' END AS Departments
FROM @Table1 AS x
GROUP BY x.IdeaNr
Upvotes: 3
Reputation: 32693
If you want to read more about this topic search for top-N-per-group
. In SQL Server it is easy to do using CROSS APPLY
.
WITH
CTE
AS
(
SELECT
IdeaNr
,COUNT(*) AS SubmitterCount
FROM @Table1
GROUP BY IdeaNr
)
SELECT
CTE.IdeaNr
,CTE.SubmitterCount
,CA.SubmitterName
,CA.SubmitterDepartment
FROM
CTE
CROSS APPLY
(
SELECT TOP(1)
T.SubmitterName
,T.SubmitterDepartment
FROM @Table1 AS T
WHERE T.IdeaNr = CTE.IdeaNr
--ORDER BY T.SubmitterName
--ORDER BY T.SubmitterDepartment
--ORDER BY CRYPT_GEN_RANDOM(4)
) AS CA
ORDER BY CTE.IdeaNr;
If you don't put any ORDER BY
in the CROSS APPLY
part the server will pick one "random" row. It is not random as such, but results may be the same or may differ when you run this query several times. In practice, results will most likely differ if you create or drop indexes on the table, but if the table is large they may differ every time the query runs.
If you want to pick some specific row for each IdeaNr
, then use ORDER BY
Name or Department or some ID, etc.
If you want to pick a really random row, then ORDER BY CRYPT_GEN_RANDOM(4)
.
I get the following result without any ORDER BY
when I use table variable for this test without any indexes:
IdeaNr SubmitterCount SubmitterName SubmitterDepartment
1 4 Joe Org1
2 1 Sue Org2
3 2 Bill Org2
4 3 Ted Org3
It looks as if it picked the "first" row for each IdeaNr
in the order as they were added to the table. But, don't be fooled, the order is not guaranteed without explicit ORDER BY
. If you want to get the first row for each IdeaNr
in the order as they were added to the table, you need to store information about this order somehow. For example, add a column ID int IDENTITY
to the table that would increment automatically as new rows are added and then you can use it like this ORDER BY ID DESC
to get guaranteed results.
Play with SQL Fiddle to see how it works.
Upvotes: 3