gvgramazio
gvgramazio

Reputation: 1154

Extract stats from datetime intervals in MySQL

I have a MySQL database with some table stored as innoDB type. One of them is work_session and its structure is:

The table is used to register the time spent for each project. Of course in each row end is bigger than begin. Multiple rows may have the same project_id but intervals denoted by begin and end of rows with the same project_id don't overlap.

My aim is to extract some stats from this table. In particular:

I was able to achieve this using PHP but I want to learn how to do it with MySQL.

I was able to obtain the first stat with

SELECT 
  `project_id`,
  SUM(`work_session_duration`) AS `project_duration`
FROM
  (SELECT 
    `project_id`,
    TIMESTAMPDIFF(SECOND, `begin`, `end`) AS `work_session_duration`
   FROM `work_sessions`) AS `t1`
GROUP BY `project_id`

The other three stats are very similar, in particular the last two, but I'm not able to figure out a way to them with MySQL.

Upvotes: 1

Views: 152

Answers (2)

Paul Spiegel
Paul Spiegel

Reputation: 31792

I think the hourly statistic is the most difficult part. So i will only provide a solution for it. Other questions can be solved in a similar way.

You will need some helper tables:

calendar

|    date    |
|------------|
| 1970-01-01 |
| 1970-01-02 |
| ...        |
| 1940-12-30 |
| 1940-12-31 |

hours

| hour |
|------|
|    0 |
|    1 |
|  ... |
|   22 |
|   23 |

You can find many answers on SO, how to create such tables. I will skip that part to keep my answer compact.

With the helper tables you can create a CROSS JOIN in a subquery containing any date-hour combination you need for your data. That subquery can be joined with your table on overlapping. Then you can group by project_id and h.hour and calculate the sum of worked time:

select s.project_id, h.hour,
    sum(timestampdiff(second, greatest(s.begin, h.begin), least(s.end, h.end))) as project_duration
from (    
    select h.hour,
           timestamp(c.date, concat(h.hour, ':00:00')) as `begin`,
           timestamp(c.date, concat(h.hour+1, ':00:00')) as `end`
    from calendar c
    cross join hours h
    where c.date >= (select date(min(`begin`)) from work_sessions)
      and c.date <= (select date(max(`end`))   from work_sessions)
) h
inner join work_sessions s
    on  s.begin < h.end
    and s.end   > h.begin
group by s.project_id, h.hour

Upvotes: 1

Lucas Franco
Lucas Franco

Reputation: 382

Did you try something like this?

SELECT 
  `project_id`,
  SUM(`work_session_duration`) AS `project_duration`,
  `days`
FROM
  (SELECT 
    `project_id`,
    TIMESTAMPDIFF(SECOND, `begin`, `end`) AS `work_session_duration`,
    DATE_FORMAT(`begin`, '%Y-%m-%d') AS `days`
   FROM `work_sessions`) AS `t1`
GROUP BY `project_id`, `days`
ORDER BY `days`

and

SELECT 
  `project_id`,
  SUM(`work_session_duration`) AS `project_duration`,
  HOUR(`begin`) AS `hours`
FROM
  (SELECT 
    `project_id`,
    TIMESTAMPDIFF(SECOND, `begin`, `end`) AS `work_session_duration`,
    `begin`
   FROM `work_sessions`) AS `t1`
GROUP BY `project_id`, `hours`

and

SELECT 
  `project_id`,
  SUM(`work_session_duration`) AS `project_duration`,
  WEEKDAY(`begin`) AS `weekdays`
FROM
  (SELECT 
    `project_id`,
    TIMESTAMPDIFF(SECOND, `begin`, `end`) AS `work_session_duration`,
    `begin`
   FROM `work_sessions`) AS `t1`
GROUP BY `project_id`, `weekdays`

Or something like this...

Upvotes: 0

Related Questions