hncl
hncl

Reputation: 2295

SQL GROUP BY CASE

I am trying to get the age of visitor from three tables: Visits that holds the viist date by client ID, and Clients that holds the Client ID, and Clinic that holds the Clinic ID.

Here is my sql statement:

DECLARE @Clinicname nvarchar(50) 
SET @Clinicname='First Clinic'

SELECT CASE 
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 10 THEN '1-10' 
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 20 THEN '11-20' 
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 30 THEN '21-30'
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 40 THEN '31-40' 
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 50 THEN '41-50'
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 60 THEN '51-60'
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 70 THEN '61-70'
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 80 THEN '71-80'
         ELSE '81+' 
       END AS age, 
       COUNT(*) AS n

FROM  Visit v 
          INNER JOIN  Client c ON v.ClientID = c.ClientID
          INNER JOIN  Clinic r ON v.ClinicId = r.ClinicId
          WHERE r.Name IN (@Clinicname)   
GROUP BY CASE 
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 10 THEN '1-10' 
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 20 THEN '11-20' 
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 30 THEN '21-30'
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 40 THEN '31-40' 
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 50 THEN '41-50'
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 60 THEN '51-60'
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 70 THEN '61-70'
         WHEN DATEDIFF(YEAR, c.BirthDate, MIN(v.Date)) <= 80 THEN '71-80'
         ELSE '81+'
         END 

Here is the error: Cannot use an aggregate or a subquery in an expression used for the group by list of a GROUP BY clause.

When I replace MIN(v.Date) with GETDATE() it works.

Upvotes: 0

Views: 3912

Answers (2)

anon
anon

Reputation:

Some guesses, based on loose specs, unknown version, etc. One piece of advice: DATEDIFF(YEAR is not a reliable way to calculate age. Think about cases where the person's birthday is in December and they have a visit in January, or vice versa. DATEDIFF only counts how many boundaries have been crossed, it doesn't care if the person's birthday has happened yet.

You could probably combine some of these query fragments, but chunking them out may make the logic more digestible.

SQL Server 2005+

;WITH vis AS
(
    SELECT 
        v.ClientId, 
        FirstVisit = MIN(v.[Date]), 
        NumVisits = COUNT(*)
    FROM dbo.Visit AS v
    INNER JOIN dbo.Clinic AS c
    ON c.ClinicId = v.ClinicId
    WHERE c.Name IN (@Clinicname)
    GROUP BY v.ClientId
),
ages AS (
    SELECT 
        v.ClientId,
        rough_age = DATEDIFF(YEAR, c.BirthDate, v.FirstVisit),
        v.NumVisits
    FROM vis AS v
    INNER JOIN dbo.Client AS c
    ON v.ClientId = c.ClientId
),
cats([group], numVisits) AS (
    SELECT 
        CASE WHEN rough_age/10 > 8 THEN '81+' ELSE
        CONVERT(VARCHAR(32), ((rough_age/10)+1)*10-9) + '-'
        + CONVERT(VARCHAR(12), ((rough_age/10)+1)*10) END,
        numVisits
   FROM ages
)
SELECT [group], NumClients = COUNT(*), NumVisits = SUM(numVisits)
FROM cats
GROUP BY [group];   

SQL Server 2000

SELECT [group], NumClients = COUNT(*), NumVisits = SUM(numVisits)
FROM (
    SELECT 
        [group] = CASE WHEN rough_age/10 > 8 THEN '81+' ELSE
        CONVERT(VARCHAR(32), ((rough_age/10)+1)*10-9) + '-'
        + CONVERT(VARCHAR(12), ((rough_age/10)+1)*10) END,
        numVisits
   FROM 
   (
    SELECT 
        v.ClientId,
        rough_age = DATEDIFF(YEAR, c.BirthDate, v.FirstVisit),
        v.NumVisits
    FROM 
    (
        SELECT 
            v.ClientId, 
            FirstVisit = MIN(v.[Date]), 
            NumVisits = COUNT(*)
        FROM dbo.Visit AS v
        INNER JOIN dbo.Clinic AS c
        ON c.ClinicId = v.ClinicId
        WHERE c.Name = @Clinicname
        GROUP BY v.ClientId
    ) AS v
    INNER JOIN dbo.Client AS c
    ON v.ClientId = c.ClientId
  ) AS ages
) AS cats
GROUP BY [group];   

Upvotes: 3

Andrew Shepherd
Andrew Shepherd

Reputation: 45252

You can get around the grouping issue by using an inner query. This would be preferable in any case as you do not have to duplicate the bucketing logic.

I've also removed the MIN(v.Date) because I don't think that is really necessary.

DECLARE 
     @Clinicname nvarchar(50);

SET @Clinicname='First Clinic'


select age, count(*) from
(
    SELECT CASE 
         WHEN DATEDIFF(YEAR, c.BirthDate, v.Date) <= 10 THEN '1-10' 
         WHEN DATEDIFF(YEAR, c.BirthDate, v.Date) <= 20 THEN '11-20' 
         WHEN DATEDIFF(YEAR, c.BirthDate, v.Date) <= 30 THEN '21-30'
         WHEN DATEDIFF(YEAR, c.BirthDate, v.Date) <= 40 THEN '31-40' 
         WHEN DATEDIFF(YEAR, c.BirthDate, v.Date) <= 50 THEN '41-50'
         WHEN DATEDIFF(YEAR, c.BirthDate, v.Date) <= 60 THEN '51-60'
         WHEN DATEDIFF(YEAR, c.BirthDate, v.Date) <= 70 THEN '61-70'
         WHEN DATEDIFF(YEAR, c.BirthDate, v.Date) <= 80 THEN '71-80'
         ELSE '81+' 
       END AS age
    FROM  Visit v 
          INNER JOIN  Client c ON v.ClientID = c.ClientID
          INNER JOIN  Clinic r ON v.ClinicId = r.ClinicId
          WHERE r.Name IN (@Clinicname) 
) t group by age;

Upvotes: 0

Related Questions