Kramer
Kramer

Reputation: 267

How to calculate monthly average in MySQL given monthly data that spans more than one month?

I have data that spans multiple months and I want to be able to take the average per day and separate it to the appropriate months. For example, say that one data point is 2/9/2010 - 3/8/2010 and the amount is 1500. Then, the query should return 1071.4 for February 2010 and 428.6 for March. I am hoping there is a MySQL statement that will be able to do the computations instead of my PHP logic. Thanks.

EDIT (added table definition): start (datetime), end (datetime), use

EDIT 2: Here is some dummy data



DROP TABLE IF EXISTS `dummy_data`;
CREATE TABLE `dummy_data` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `start_date` date NOT NULL,
  `end_date` date NOT NULL,
  `data` double(15,4) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=8 DEFAULT CHARSET=latin1;

-- ----------------------------
--  Records of `dummy_data`
-- ----------------------------
BEGIN;
INSERT INTO `dummy_data` VALUES ('1', '2010-01-01', '2010-02-02', '200.0000'), ('2', '2010-02-03', '2010-02-25', '250.0000'), ('3', '2010-02-26', '2010-03-08', '300.0000'), ('4', '2010-03-09', '2010-04-12', '210.0000'), ('5', '2010-04-13', '2010-05-10', '260.0000'), ('6', '2010-05-11', '2010-06-15', '310.0000'), ('7', '2010-06-16', '2010-07-20', '320.0000');
COMMIT;

Upvotes: 2

Views: 6380

Answers (3)

pilcrow
pilcrow

Reputation: 58534

This solution handles [start_date, end_date] spans of as small as one (1) day and as large as twelve (12) months, but is incorrect on a span of thirteen (13) or more months:

CREATE TABLE integers (i INT NOT NULL);

INSERT INTO integers VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9);

CREATE VIEW hundreds AS
   SELECT iii.i * 100 + ii.i * 10 + i AS i
     FROM integers i JOIN integers ii JOIN integers iii;

-- We do not have CTEs, so we create a view
CREATE VIEW spans AS
   SELECT id, start_date, DATEDIFF(end_date, start_date) + 1 AS ndays, data
     FROM dummy_data;

   SELECT spans.id,
          month_name,
          spans.data * COUNT(month_name)/spans.ndays AS month_amount
     FROM spans
LEFT JOIN (SELECT id,
                  MONTH(start_date + INTERVAL i DAY) AS month_num,
                  MONTHNAME(start_date + INTERVAL i DAY) AS month_name
             FROM spans
             JOIN hundreds WHERE i < ndays) daybyday
       ON spans.id = daybyday.id
 GROUP BY id, month_name
 ORDER BY id, month_num;

Output looks like this:

+----+------------+---------------+
| id | month_name | month_amount  |
+----+------------+---------------+
|  1 | January    |  187.87878788 | 
|  1 | February   |   12.12121212 | 
|  2 | February   |  250.00000000 | 
|  3 | February   |   81.81818182 | 
...

We use DATEDIFF to determine the number of days represented by a source record. Then, building off an integers table, we can enumerate the month of each day in a particular span. From there it's a matter of SQL aggregation by record id and month_name.

Upvotes: 1

Matthew
Matthew

Reputation: 10444

You should select SUM and group by the MONTH(date) function as follows:

SELECT SUM(value), MONTH(date)
FROM TABLE
GROUP BY MONTH(date)

EDIT: Ooops, I misread the question, revising my answer now!

You'll need to use some more complex TSQL to get the days in the months, find the average and set into each field then present it in a monthly format

update Create a function to give you days in your range such as this one, which I modified from a function created by Michael Baria

CREATE FUNCTION [dbo].[GetDays](@StartDate DATETIME, @EndDate DATETIME)
RETURNS @MonthList TABLE(DayValue tinyint NOT NULL, MonthValue tinyint NOT NULL, YearValue int NOT NULL)
AS
BEGIN
--Variable used to hold each new date value
DECLARE @DateValue DATETIME

--Start with the starting date in the range
SET @DateValue=@StartDate

--Load output table with the month part of each new date
WHILE @DateValue <= @EndDate
BEGIN
    INSERT INTO @MonthList(DayValue, MonthValue,YearValue)
    SELECT DAY(@DateValue), MONTH(@DateValue), YEAR(@DateValue)

    --Move to the next day           
    SET @DateValue=DATEADD(dd,1,@DateValue)
END

--Return results
RETURN
END

GO

Join your table to this function then sum based on the days

SELECT SUM(data/DATEDIFF(dd,startDate,endDate)), M.MonthValue
FROM TABLE
JOIN (SELECT * FROM [dbo].[GetDays] (startDate,endDate)) M 

GROUP BY M.MonthValue

I can clean this up a bit if I get some sample data

Upvotes: 2

Guffa
Guffa

Reputation: 700252

This does separate the amount over the months:

declare @start datetime set @start = '20100209'
declare @end datetime set @end = '20100308'
declare @avg float set @avg = 1500

select
datediff(day, @start, dateadd(day, 1-day(@end), @end)) * @avg / (datediff(day, @start, @end) + 1),
datediff(day, dateadd(day, -day(@end), @end), @end) * @avg / (datediff(day, @start, @end) + 1)

Result:

1071,42857142857    428,571428571429

It gets a bit more complicated though, as you first have to check if the dates are actually in different months, and if the date spans more than two months you need a different approach.

Upvotes: 0

Related Questions