elmera
elmera

Reputation: 21

Grouping MySQL results by 7 day increments

Hoping someone might be able to assist me with this.

Assume I have the table listed below. Hosts can show up multiple times on the same date, usually with different backupsizes.

+------------------+--------------+
| Field            | Type         | 
+------------------+--------------+
| startdate        | date         |
| host             | varchar(255) | 
| backupsize       | float(6,2)   |  
+------------------+--------------+

How could I find the sum total of backupsize for 7 day increments starting with the earliest date, through the last date? I don't mind if the last few days get cut off because they don't fall into a 7 day increment.

Desired output (prefered):

+------------+----------+----------+----------+-----
|Week of     | system01 | system02 | system03 | ...
+------------+----------+----------+----------+-----
| 2014/07/30 | 2343.23  | 232.34   | 989.34   |
+------------+----------+----------+----------+-----
| 2014/08/06 | 2334.7   | 874.13   | 234.90   |
+------------+----------+----------+----------+-----
| ...        | ...      | ...      | ...      |

OR

+------------+------------+------------+------
|Host        | 2014/07/30 | 2014/08/06 | ...
+------------+------------+------------+------
| system01   | 2343.23    | 2334.7     | ...  
+------------+------------+------------+-------
| system02   | 232.34     | 874.13     | ...
+------------+------------+------------+-------
| system03   | 989.34     | 234.90     | ...
+------------+------------+------------+-------
| ...        | ...        | ...        |       

Date format is not a concern, just as long as it gets identified somehow. Also, the order of the hosts is not a concern either. Thanks!

Upvotes: 0

Views: 1277

Answers (3)

elmera
elmera

Reputation: 21

So I was able to determine a solution that fit my needs using a procedure I created by putting together concepts from your recommended solutions as well as some other other solutions I found on this site. The procedure SUM's by 7 day increments as well as does a pivot.

DELIMITER $$
CREATE PROCEDURE `weekly_capacity_by_host`()
BEGIN
SELECT MIN(startdate) into @start_date FROM testtable;

SET @SQL = NULL;
SELECT
  GROUP_CONCAT(DISTINCT
    CONCAT(
      'SUM(if(host=''',host,''', backupsize, 0)) as ''',host,''''
    )
  ) INTO @SQL
FROM testtable;

SET @SQL = CONCAT('SELECT 1 + DATEDIFF(startdate, ''',@start_date,''') DIV 7 AS week_num
  , ''',@start_date,''' + INTERVAL (DATEDIFF(startdate, ''',@start_date,''') DIV 7) WEEK AS week_start,
  ', @SQL,' 
  FROM testtable group by week_num'
);

PREPARE stmt FROM @SQL;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
END$$
DELIMITER ;

Output appears as follows:

mysql> call weekly_capacity_by_host;
+----------+------------+----------+----------+----------+----------+
| week_num | week_start | server01 | server02 | server03 | server04 |
+----------+------------+----------+----------+----------+----------+
|        1 | 2014-06-11 |  1231.08 |    37.30 |    12.04 |    68.17 |
|        2 | 2014-06-18 |  1230.98 |    37.30 |    11.76 |    68.13 |
|        3 | 2014-06-25 |  1243.12 |    37.30 |     8.85 |    68.59 |
|        4 | 2014-07-02 |  1234.73 |    37.30 |    11.77 |    67.80 |
|        5 | 2014-07-09 |   341.32 |     0.04 |     0.14 |     4.94 |
+----------+------------+----------+----------+----------+----------+
5 rows in set (0.03 sec)

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1270301

The simplest way is to get the earliest date and just count the number of days:

select x.minsd + interval floor(datediff(x.minsd, lb.startdate) / 7) day as `Week of`,
       host,
       sum(backupsize)
from listedbelow lb cross join
     (select min(startdate) as minsd from listedbelow lb) x
group by floor(datediff(x.minsd, lb.startdate) / 7)
order by 1;

This produces a form with week of and host on each row. You can pivot the results as you see fit.

Upvotes: 3

Barranka
Barranka

Reputation: 21047

I'll assume that what you want is the sum of bakcupsize grouped by host and that seven-day interval you are talking about.

My solution would be something like this:

  1. You need to define the first date, and then "create" a column with the date you want (the end of the seven-day period)
  2. Then I would group it.

I think temporary tables and little tricks with temp variables are the best way to tackle this, so:

drop table if exists temp_data;
create temporary table temp_data
select a.*
     -- The @d variable will have the date that you'll use later to group the data.
     , @d := case
          -- If the current "host" value is the same as the previous one, then...
           when @host_prev = host then 
               -- ... if @d is not null and is within the seven-day period,
               -- then leave the value of @d intact; in other case, add 7 days to it.
               case
                   when @d is not null or a.startdate <= @d then @d
                   -- The coalesce() function will return the first not null argument
                   -- (just as a precaution)
                   else dateadd(coalesce(@d, a.startdate), interval +7 day)
               end
           -- If the current "host" value is not  the same as the previous one, 
           -- then take the current date (the first date of the "new" host) and add
           -- seven days to it.
           else @d = dateadd(a.startdate, interval +7 day) 
       end as date_group
     -- This is needed to perform the comparisson in the "case" piece above
     , @host_prev := a.host as host2
from
       (select @host_prev = '', @d = null) as init -- Initialize the variables
     , yourtable as a
-- IMPORTANT: This will only work if you order the data properly
order by a.host, a.startdate;
-- Add indexes to the temp table, to make things faster
alter table temp_data
   add index h(host),
   add index dg(date_group)
   -- OPTIONAL: You can drop the "host2" column (it is no longer needed)
   -- , drop column host2
   ;

Now, you can get the grouped data:

select a.host, a.date_group, sum(a.bakcupsize) as backupsize
from temp_data as a
group by a.host, a.date_group;

This will give you the unpivoted data. If you want to build a pivot table with it, I recommend you take a look to this article, and/or read this question and its answers. In short, you'll have to build a "dynamic" sql instruction, prepare a statement with it and execute it.


Of course, if you want to group this by week, there's a simpler approach:

drop table if exists temp_data2;
create temporary table temp_data2
select a.*
     -- The following will give you the end-of-week date
     , dateadd(a.startdate, interval +(6 - weekday(a.startdate)) day) as group_date
from yourtable as a;
alter table temp_data
   add index h(host),
   add index dg(date_group);
select a.host, a.date_group, sum(a.bakcupsize) as backupsize
from temp_data as a
group by a.host, a.date_group;

I leave the pivot part to you.

Upvotes: 0

Related Questions