Azima
Reputation: 4151

group by year on multiple date columns mysql

I have a table as follows:

total_hours | ... | task_assigned  | task_deadline  | task_completion
---------------------------------------------------------------------
123         | ... | 2019-08-01     | -              | -
234         | ... | -              | 2018-08-01     | 2019-08-01
145         | ... | 2017-08-01     | 2017-08-01     | 2018-01-01

I want to calculate total hours for each year, i.e. grouping by year.

Currently I'm only taking the task_completion field into account.

If there's no value in the task_completion field, the record is not included in the SUM calculation.

To elaborate further, for year 2019, rows 1 and 2 should both be considered. Hence the total hours should be 123 + 234 = 357.

For year 2018, rows 2 and 3 should be considered.

Similarly, for year 2017, row 3.

SELECT YEAR(task_completion) as year, ROUND(SUM(total_hours), 2) as hours 
FROM task
GROUP BY year
HAVING year BETWEEN '$year_from' AND '$year_to'

The resultset:

year  |  hours
--------------------
2017  |  <somevalue>
2018  |  <somevalue>
2019  |  <somevalue>

How can I include the other two date fields too?

Upvotes: 1

Views: 542

Answers (3)

Gordon Linoff
Reputation: 1270011

Basically, you want to unpivot the data. I will assume that the - represents a NULL value and your dates are real dates.

select year(dte) as year, sum(total_hours) as hours
from ((select task_assigned as dte, total_hours
       from task
      ) union all
      (select task_deadline, total_hours
       from task
      ) union all
      (select task_completion, total_hours
       from task
      )
     ) d
where dte is not null
group by year(dte)
order by year(dte);

Based on your sample data, the round() is not necessary so I removed it.

If you want to filter for particular years, the filtering should be in a where clause -- so it filters the data before aggregation.

Change the where to:

where year(dte) >= ? and year(dte) <= ?

or:

where dte >= ? and dte <= ?

to pass in the dates.

The ? are for parameter placeholders. Learn how to use parameters rather than munging query strings.
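As a minimal sketch of the parameterized form, the snippet below runs the unpivot query against the question's sample rows. It makes several assumptions that are not part of the original answer: Python's sqlite3 module stands in for MySQL, a small `year` helper is registered because SQLite has no YEAR(), the parentheses around each SELECT are dropped because SQLite does not accept them, and `hours_per_year` is a hypothetical name. With a MySQL driver the placeholder syntax may differ (for example `%s`).

```python
import sqlite3

# Assumption: SQLite stands in for MySQL. It has no YEAR(), so register a helper.
def year(value):
    return int(value[:4]) if value else None

conn = sqlite3.connect(":memory:")
conn.create_function("year", 1, year)
conn.execute(
    "CREATE TABLE task (total_hours REAL, task_assigned TEXT, "
    "task_deadline TEXT, task_completion TEXT)"
)
# The three sample rows from the question; '-' is stored as NULL.
conn.executemany(
    "INSERT INTO task VALUES (?, ?, ?, ?)",
    [
        (123, "2019-08-01", None, None),
        (234, None, "2018-08-01", "2019-08-01"),
        (145, "2017-08-01", "2017-08-01", "2018-01-01"),
    ],
)

# Unpivot with UNION ALL, then filter in WHERE (before aggregation).
QUERY = """
SELECT year(dte) AS yr, SUM(total_hours) AS hours
FROM (SELECT task_assigned AS dte, total_hours FROM task
      UNION ALL
      SELECT task_deadline, total_hours FROM task
      UNION ALL
      SELECT task_completion, total_hours FROM task) d
WHERE year(dte) >= ? AND year(dte) <= ?
GROUP BY year(dte)
ORDER BY year(dte)
"""

def hours_per_year(year_from, year_to):
    # The year bounds are bound as parameters, never pasted into the SQL text.
    return conn.execute(QUERY, (year_from, year_to)).fetchall()

print(hours_per_year(2018, 2019))  # [(2018, 379.0), (2019, 357.0)]
```

Rows whose date is NULL drop out on their own, because a comparison with NULL is never true. Note that with UNION ALL the third sample row counts twice for 2017 (assigned and deadline both fall in 2017), so `hours_per_year(2017, 2017)` gives 290 rather than 145.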

Upvotes: 1

Thorsten Kettner
Reputation: 94959

This answer is no longer valid with the updated request.

If I understand correctly, you want to use task_assigned if task_completion is still null. Use COALESCE for this.

SELECT
  YEAR(COALESCE(task_completion, task_assigned)) as year,
  ROUND(SUM(total_hours), 2) as hours
FROM task
GROUP BY year
HAVING year BETWEEN $year_from AND $year_to
ORDER BY year;

(I don't think you actually want to use task_deadline, too, for how could a task get completed before getting assigned first? If that can occur, include it in the COALESCE expression, probably as COALESCE(task_completion, task_assigned, task_deadline).)

Upvotes: 0

Thorsten Kettner
Reputation: 94959

You want to consider each row once for each of its years. Use UNION to get these years:

select year, round(sum(total_hours), 2) as hours
from
(
  select year(task_assigned) as year, total_hours from task
  union
  select year(task_deadline) as year, total_hours from task
  union
  select year(task_completion) as year, total_hours from task
) years_and_hours
group by year
having year between $year_from and $year_to
order by year;

If a row that has the same year in two or three of its date columns should be counted that many times in the sum, change UNION to UNION ALL.
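To make the difference concrete, here is a small sketch that runs both variants on the question's sample rows. The assumptions are mine, not the answer's: Python's sqlite3 module stands in for MySQL, a `year` helper replaces MySQL's YEAR(), NULL years are dropped with a WHERE clause, and the `id` column is hypothetical. The last part shows one caveat: UNION removes duplicate (year, total_hours) pairs, so two different tasks with the same hours in the same year collapse into one unless something that identifies the row is carried along.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; register a stand-in for YEAR().
def year(value):
    return int(value[:4]) if value else None

conn = sqlite3.connect(":memory:")
conn.create_function("year", 1, year)
# 'id' is a hypothetical key column; the question does not show one.
conn.execute(
    "CREATE TABLE task (id INTEGER PRIMARY KEY, total_hours REAL, "
    "task_assigned TEXT, task_deadline TEXT, task_completion TEXT)"
)
INSERT = (
    "INSERT INTO task (total_hours, task_assigned, task_deadline, "
    "task_completion) VALUES (?, ?, ?, ?)"
)
conn.executemany(INSERT, [
    (123, "2019-08-01", None, None),
    (234, None, "2018-08-01", "2019-08-01"),
    (145, "2017-08-01", "2017-08-01", "2018-01-01"),
])

TEMPLATE = """
SELECT yr, SUM(total_hours) AS hours
FROM (SELECT year(task_assigned) AS yr, {cols} FROM task
      {op}
      SELECT year(task_deadline), {cols} FROM task
      {op}
      SELECT year(task_completion), {cols} FROM task) years_and_hours
WHERE yr IS NOT NULL
GROUP BY yr
ORDER BY yr
"""

def totals(op, cols="total_hours"):
    # op and cols are fixed SQL fragments chosen in this script, not user input.
    return conn.execute(TEMPLATE.format(op=op, cols=cols)).fetchall()

print(totals("UNION"))      # [(2017, 145.0), (2018, 379.0), (2019, 357.0)]
print(totals("UNION ALL"))  # [(2017, 290.0), (2018, 379.0), (2019, 357.0)]

# Caveat: a second 145-hour task in 2017 is merged away by plain UNION.
conn.execute(INSERT, (145, "2017-03-01", None, None))
print(totals("UNION"))                       # 2017 is still 145.0
print(totals("UNION", "id, total_hours"))    # 2017 becomes 290.0
```

Carrying the row identifier through the UNION keeps the "once per row per year" behaviour while still separating distinct tasks that happen to have equal hours.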

Upvotes: 2
