Steve
Steve

Reputation: 13

Count query for data in 2 separate tables

I'm trying to count how many jobs have more than one task associated with them. There is a table for Jobs and a separate table for tasks. These are linked as JOBS.ID = TASKS.WORKID.

I'm using a select count(*) statement to count where the sequence of tasks is greater than one (eg multiple tasks).

Jobs

TABLE - JOBS

tasks

TABLE - TASKS

the statement is:

SELECT MONTH(Date) AS statsMonth, YEAR(Date) AS theYear, QuarterNumber,
(SELECT COUNT(*) AS Expr1
  FROM JOB LEFT JOIN TASK ON TASK.workId = JOB.id
  WHERE (TASK.CatId in ('1','2')) AND (MONTH(JOB.LoggedDate) = MONTH(internal_dates.Date)) AND (YEAR(JOB.LoggedDate) = YEAR(internal_dates.Date)) HAVING COUNT(TASK.sequence) > 1) AS "XX"
FROM            internal_dates
WHERE date BETWEEN '2019-12-30 05:03:25' AND '2020-03-30 05:03:25'
GROUP BY MONTH(Date), YEAR(Date), QuarterNumber
ORDER BY theYear desc, statsMonth desc, QuarterNumber desc;

however when I run this query against the data shown I get a count of 2 for statsMonth 3, year 2020

as can be seen from the data, there is only 1 JOB with more than one task eg JOB = 2

I need to report on the number of JOBS that have more than 1 TASK associated with them...

Upvotes: 1

Views: 39

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269953

You are not counting the number of jobs with more than one task. For that, you need two levels of aggregation. The inner subquery summarizes by the job. The outer one counts the one with more than one task:

SELECT MONTH(Date) AS statsMonth, YEAR(Date) AS theYear, QuarterNumber,
       (SELECT COUNT(*)
        FROM (SELECT COUNT(*) as cnt
              FROM JOB j JOIN
                   TASK t
                   ON t.workId = j.id
              WHERE t.CatId IN (1, 2) AND
                    MONTH(j.LoggedDate) = MONTH(id.Date) AND
                    YEAR(j.LoggedDate) = YEAR(id.Date)
              GROUP BY j.id
              HAVING COUNT(TASK.sequence) > 1
             ) j
      )
FROM internal_dates id
WHERE id.date BETWEEN '2019-12-30 05:03:25' AND '2020-03-30 05:03:25'
GROUP BY MONTH(id.Date), YEAR(id.Date), QuarterNumber
ORDER BY theYear desc, statsMonth desc, QuarterNumber desc;

Note that table aliases make the query easier to write and to read.

Upvotes: 1

Related Questions