Chris B
Chris B

Reputation: 709

Simple SQL Server COUNT query (counting changes to values in a column)

I have a table with columns: MONTH, YEAR, PROJECT_ID, STATUS.

Status can be:

I want to know how many projects completed in a given month i.e. :

where STATUS changed from anything that is NOT C to C;

It sounds simple...!

It's easy to find when any given project completed with:

SELECT TOP 1 MONTH,YEAR,PROJECT_ID FROM Table WHERE PROJECT_ID=9236 AND RAG='C'
  ORDER BY YEAR ASC, MONTH ASC

But given year = 2011 and month = 8 (for example), I have no idea how to find the number of projects that had status='C' for the first time that month. Any ideas?

Edit: projects are still included as rows with status='C' after they complete, so I can't just count the Cs as that will return the number of projects that completed in this AND previous months (hence the chronological ordering and select top 1).

Sample data for 10/2010 to 01/2011 months:

Month | Year | Project | Status
-------------------------------
10    | 2010 | A       | G
11    | 2010 | A       | C
12    | 2010 | A       | C
1     | 2011 | A       | C
10    | 2010 | B       | R
11    | 2010 | B       | R
12    | 2010 | B       | R
1     | 2011 | B       | R
10    | 2010 | C       | G
11    | 2010 | C       | G
12    | 2010 | C       | G
1     | 2011 | C       | C
10    | 2010 | D       | A
11    | 2010 | D       | C
12    | 2010 | D       | C
1     | 2011 | D       | C

^ Projects A and D was completed in 11/2010. Project B hasn't changed to completed in any of the four months shown. Project C was completed in 01/2011. {Month,Year,Project} is the primary key.

So, inputs and outputs would be:

10/2010 => 0
11/2010 => 2 (because of A and D)
12/2010 => 0
1/2011 => 1 (because of C)

Upvotes: 1

Views: 3138

Answers (4)

Silvia Parfeni
Silvia Parfeni

Reputation: 528

I like to use this function: lead() over(). If you have, for example, this select:

select Month, Year, Project, Status
from youTable
where 1 = 1 --if you have any condition

I find next value of "status" column with lead() function and I compare with the current one so :

select count(1) as number from
   (select lead(Status) over(order by Project) as nextStatus,  Month, Year, Project, Status
    from youTable
    where 1=1) as tmp
where tmp.nextStatus <> tmp.Status

now, in number I have the numbers of changed value into "Status" column

Upvotes: 1

Jake Feasel
Jake Feasel

Reputation: 16955

SELECT 
   distinctMonths.month, 
   distinctMonths.year, 
   count(countProjects.project) as numChanges
FROM
  (
  SELECT DISTINCT
    month, year
  FROM
   Table
  ) as distinctMonths -- need to get all months available, independent of the project status, in case there were not an complete ones during a given month
    LEFT OUTER JOIN
    (
      SELECT
        Month, Year, Project
      FROM
        Table
      WHERE
        status = 'C' AND
        NOT EXISTS ( -- this will filter out our result set to only include the earliest instance of the given project's complete status
          SELECT 
            1 
          FROM 
            Table t2 
          WHERE 
            t2.project = Table.project AND
            t2.status = 'C' AND
            ( -- this will convert the date fragments into proper date values, that can be compared easily
                cast(
                    cast(t2.year as varchar) + '-' + cast(t2.month as varchar) + '-1'
                as datetime) 
                < 
                cast(
                    cast(table.year as varchar) + '-' + cast(table.month as varchar) + '-1'
                as datetime) 
            )

        )

    ) as countProjects ON
      distinctMonths.month = countProjects.month AND
      distinctMonths.year = countProjects.year
GROUP BY
   distinctMonths.month, 
   distinctMonths.year
ORDER BY
  distinctMonths.year,
  distinctMonths.month

Upvotes: 1

pkmiec
pkmiec

Reputation: 2694

There you go

WITH 
src(month, year, project, status) AS (
    SELECT 10,2010,'A','G' UNION ALL
    SELECT 11,2010,'A','C' UNION ALL
    SELECT 12,2010,'A','C' UNION ALL
    SELECT 1,2011,'A','C' UNION ALL
    SELECT 10,2010,'B','R' UNION ALL
    SELECT 11,2010,'B','R' UNION ALL
    SELECT 12,2010,'B','R' UNION ALL
    SELECT 1,2011,'B','R' UNION ALL
    SELECT 10,2010,'C','G' UNION ALL
    SELECT 11,2010,'C','G' UNION ALL
    SELECT 12,2010,'C','G' UNION ALL
    SELECT 1,2011,'C','C' UNION ALL
    SELECT 10,2010,'D','A' UNION ALL
    SELECT 11,2010,'D','C' UNION ALL
    SELECT 12,2010,'D','C' UNION ALL
    SELECT 1,2011,'D','C'),
src_date (date, project, status) AS (   
    SELECT date = CONVERT(DATETIME, CONVERT(VARCHAR, year * 100 + month) + '01'), project, status
    FROM src
)
SELECT month = CONVERT(VARCHAR, YEAR(alldates.date)) + '/' + CONVERT(VARCHAR, MONTH(alldates.date)), 
       projects = ISNULL(cnt.value,0)
FROM (
    SELECT DISTINCT date
    FROM src_date
) alldates
LEFT JOIN 
(
    SELECT date = min_date, value = COUNT(*)
    FROM 
    (
        SELECT project, min_date = MIN(date)
        FROM src_date
        WHERE status = 'C'
        GROUP BY project
    ) mins
    GROUP BY min_date
) cnt
ON alldates.date = cnt.date

Upvotes: 1

Sparky
Sparky

Reputation: 15085

This will give you the counts you are looking for

select p1.mm,p1.yyyy,COUNT(*)
from projs p1
join (select projid,MIN(yyyy*100+mm) as closedOn from projs 
      where stat='c' group by projId) xx 
      on xx.projId=p1.projId and p1.yyyy*100+p1.mm=xx.closedOn
where p1.stat='c' 
group by p1.mm,p1.yyyy

The inner query determines the date the project closed, so you are finding all projects which closed this month...

Upvotes: 1

Related Questions