Randster

Reputation: 298

Aggregating Several Columns in SQL

Suppose I have a table that looks like the following:

id | location | dateHired | dateRehired | dateTerminated
1  | 1        | 10/1/2011 | NULL        | 12/1/2011
2  | 1        | 10/3/2011 | 11/1/2011   | 12/31/2011
3  | 5        | 10/5/2011 | NULL        | NULL
4  | 5        | 10/5/2011 | NULL        | NULL
5  | 7        | 11/5/2011 | NULL        | 12/1/2011
6  | 10       | 11/2/2011 | NULL        | NULL

and I wanted to condense that into a summary table such that:

location | date        | hires  | rehires |   terms
1        |  10/1/2011  |   1    |    0    |     0
1        |  10/3/2011  |   1    |    0    |     0
1        |  11/1/2011  |   0    |    1    |     0
1        |  12/1/2011  |   0    |    0    |     1
1        |  12/31/2011 |   0    |    0    |     1
5        |  10/5/2011  |   2    |    0    |     0

etc.

What would that SQL look like? I was thinking it would be something like:

SELECT
  e.location
  , -- ?
  ,SUM(CASE WHEN e.dateHired IS NOT NULL THEN 1 ELSE 0 END) AS Hires
  ,SUM(CASE WHEN e.dateRehired IS NOT NULL THEN 1 ELSE 0 END) As Rehires
  ,SUM(CASE WHEN e.dateTerminated IS NOT NULL THEN 1 ELSE 0 END) As Terms
FROM
  Employment e
GROUP BY
  e.Location
  ,--?

But I'm not sure whether that's correct.

EDIT - This is for SQL 2008 R2.

Also, an INNER JOIN on the date columns assumes that there are values for all three categories, which is false; that is the original problem I was trying to solve. I was thinking of something like COALESCE, but that doesn't really make sense either.

Upvotes: 0

Views: 183

Answers (3)

Ben Thul

Reputation: 32707

How about something like:

with dates as (
    select distinct location, d from (
        select location, dateHired as [d]
        from tbl
        where dateHired is not null

        union all

        select location, dateRehired 
        from tbl
        where dateRehired is not null

        union all  

        select location, dateTerminated
        from tbl
        where dateTerminated is not null
    ) as unioned -- SQL Server requires an alias on a derived table
)

select location, [d],
    (
        select count(*) 
        from tbl 
        where location = dates.location 
            and dateHired = dates.[d]
    ) as hires,
    (
        select count(*) 
        from tbl 
        where location = dates.location 
            and dateRehired = dates.[d]
    ) as rehires,
    (
        select count(*) 
        from tbl 
        where location = dates.location 
            and dateTerminated = dates.[d]
    ) as terms
from dates

I don't have a SQL Server instance handy, or I'd test it out.
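The same result can probably be reached in one pass by combining the UNION ALL above with the SUM(CASE ...) idea from the question. What follows is an untested sketch, not a verified solution: it assumes the table is named Employment as in the question, and the names x, eventDate and eventType are made up for illustration.

```sql
-- Sketch only: assumes the Employment table from the question.
-- x, eventDate and eventType are illustrative names, not from the original post.
SELECT
    x.location,
    x.eventDate AS [date],
    SUM(CASE WHEN x.eventType = 'H' THEN 1 ELSE 0 END) AS hires,
    SUM(CASE WHEN x.eventType = 'R' THEN 1 ELSE 0 END) AS rehires,
    SUM(CASE WHEN x.eventType = 'T' THEN 1 ELSE 0 END) AS terms
FROM (
    -- Turn each non-NULL date column into its own row, tagged by event type
    SELECT location, dateHired AS eventDate, 'H' AS eventType
    FROM Employment
    WHERE dateHired IS NOT NULL

    UNION ALL

    SELECT location, dateRehired, 'R'
    FROM Employment
    WHERE dateRehired IS NOT NULL

    UNION ALL

    SELECT location, dateTerminated, 'T'
    FROM Employment
    WHERE dateTerminated IS NOT NULL
) AS x
GROUP BY x.location, x.eventDate
ORDER BY x.location, x.eventDate;
```

Because every event becomes a row before grouping, a date that only appears as a termination still gets its own output row, with 0 in the other two columns.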

Upvotes: 1

Justin Pihony

Reputation: 67085

There is probably an easier, more elegant way to solve this. However, this is the simplest and quickest approach I can think of this late that works.

CREATE TABLE #Temp
(
    Location INT,
    Date DATETIME,
    HireCount INT DEFAULT 0,
    RehireCount INT DEFAULT 0,
    DateTerminatedCount INT DEFAULT 0
)

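--Seed every distinct (Location, Date) pair up front so the UPDATEs below never need to insert
--UNION already removes duplicates; NULL dates are skipped
INSERT INTO #Temp (Location, Date)
SELECT Location, DateHired FROM Employment WHERE DateHired IS NOT NULL
UNION
SELECT Location, DateRehired FROM Employment WHERE DateRehired IS NOT NULL
UNION
SELECT Location, DateTerminated FROM Employment WHERE DateTerminated IS NOT NULL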

UPDATE #Temp
SET HireCount = Hired.HireCount
FROM #Temp
JOIN
(
    SELECT Location, DateHired AS Date, COUNT(*) AS HireCount
    FROM Employment
    GROUP BY Location, DateHired
) AS Hired
    ON Hired.Location = #Temp.Location AND Hired.Date = #Temp.Date

UPDATE #Temp
SET RehireCount= Rehire.RehireCount
FROM #Temp
JOIN
(
    SELECT Location, DateRehired AS Date, COUNT(*) AS RehireCount
    FROM Employment
    GROUP BY Location, DateRehired
) AS Rehire
    ON Rehire.Location = #Temp.Location AND Rehire.Date = #Temp.Date

UPDATE #Temp
SET DateTerminatedCount = Terminated.DateTerminatedCount
FROM #Temp
JOIN
(
    SELECT Location, DateTerminated AS Date, COUNT(*) AS DateTerminatedCount
    FROM Employment
    GROUP BY Location, DateTerminated
) AS Terminated
    ON Terminated.Location = #Temp.Location AND Terminated.Date = #Temp.Date

SELECT * FROM #Temp
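For what it's worth, the temp table and the three UPDATEs might be avoidable altogether. One option on SQL Server 2008 and later is to unpivot the three date columns with CROSS APPLY over a VALUES list and then aggregate. This is an untested sketch under the same assumption as above (a table named Employment); the alias v and its column names are invented for the example.

```sql
-- Sketch only: assumes the Employment table used above.
-- v and its column names (eventDate, hires, rehires, terms) are illustrative.
SELECT
    e.Location,
    v.eventDate AS [Date],
    SUM(v.hires)   AS HireCount,
    SUM(v.rehires) AS RehireCount,
    SUM(v.terms)   AS DateTerminatedCount
FROM Employment AS e
CROSS APPLY (VALUES
    -- One row per date column, with a 1 in the matching counter
    (e.DateHired,      1, 0, 0),
    (e.DateRehired,    0, 1, 0),
    (e.DateTerminated, 0, 0, 1)
) AS v (eventDate, hires, rehires, terms)
WHERE v.eventDate IS NOT NULL   -- drop the rows produced by NULL dates
GROUP BY e.Location, v.eventDate
ORDER BY e.Location, v.eventDate;
```

This reads the table once instead of several times, at the cost of being a little less obvious to read than the step-by-step version.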

Upvotes: 1

simple

Reputation: 65

SELECT * FROM
(SELECT location, dateHired AS [date], COUNT(1) AS hires FROM mytable GROUP BY location, dateHired) H
INNER JOIN
(SELECT location, dateRehired AS [date], COUNT(1) AS rehires FROM mytable GROUP BY location, dateRehired) R
ON H.location = R.location AND H.[date] = R.[date]
INNER JOIN
(SELECT location, dateTerminated AS [date], COUNT(1) AS terminated FROM mytable GROUP BY location, dateTerminated) T
ON H.location = T.location AND H.[date] = T.[date]
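Note that INNER JOIN only returns a row when the same location and date show up in all three subqueries. A variant that may get closer to the requested output swaps in FULL OUTER JOIN and uses COALESCE to fill the gaps. This is an untested sketch: mytable is the placeholder name used above, and terms is just a shorter alias chosen for the example.

```sql
-- Sketch only: mytable is the placeholder table name from this answer.
SELECT
    COALESCE(H.location, R.location, T.location) AS location,
    COALESCE(H.[date], R.[date], T.[date])       AS [date],
    COALESCE(H.hires, 0)   AS hires,
    COALESCE(R.rehires, 0) AS rehires,
    COALESCE(T.terms, 0)   AS terms
FROM
    (SELECT location, dateHired AS [date], COUNT(*) AS hires
     FROM mytable
     WHERE dateHired IS NOT NULL
     GROUP BY location, dateHired) AS H
FULL OUTER JOIN
    (SELECT location, dateRehired AS [date], COUNT(*) AS rehires
     FROM mytable
     WHERE dateRehired IS NOT NULL
     GROUP BY location, dateRehired) AS R
    ON R.location = H.location AND R.[date] = H.[date]
FULL OUTER JOIN
    (SELECT location, dateTerminated AS [date], COUNT(*) AS terms
     FROM mytable
     WHERE dateTerminated IS NOT NULL
     GROUP BY location, dateTerminated) AS T
    -- H may be NULL on rows that came only from R, so match against either side
    ON T.location = COALESCE(H.location, R.location)
   AND T.[date]   = COALESCE(H.[date], R.[date])
ORDER BY 1, 2;  -- ordinal positions: location, then date
```

The COALESCE in the second join condition matters: joining T to H alone would produce a separate row for a termination that falls on a date that only has a rehire.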

Upvotes: 0
