Alex
Alex

Reputation: 2299

PHP Summing amounts when overlapping date ranges are occuring

I have a MySQL table that looks like this:

record_id    amount    start_date    end_date
1            20000     2018-01-01   2018-12-01
1            -15000    2018-02-01   2018-04-01
1            50000     2018-04-02   2018-06-30

This gets translated in PHP to an array that looks like this:

[1] => [
    [
       "start_date" => "2018-01-01",
       "end_date" => "2018-12-01",
       "amount"   => 20000
    ],
    [
       "start_date" => "2018-02-01",
       "end_date" => "2018-04-01",
       "amount"   => -15000
    ],
    [
       "start_date" => "2018-04-02",
       "end_date" => "2018-06-30",
       "amount"   => 50000
    ],
]

The problem is, I need to sum overlapping amounts in the date ranges, and keep an original amount for the longest date range. So the resulting MySQL table would look like this:

record_id    amount   start_date    end_date
1            20000    2018-01-01    2018-01-30
1            5000     2018-02-01    2018-04-01 
1            70000    2018-04-02    2018-06-30
1            20000    2018-07-01    2018-12-01

And the resulting PHP array would look like this:

[1] => [
    [
       "start_date" => "2018-01-01",
       "end_date" => "2018-01-30",
       "amount"   => 20000
    ],
    [
       "start_date" => "2018-02-01",
       "end_date" => "2018-04-01",
       "amount"   => 5000
    ],
    [
       "start_date" => "2018-04-02",
       "end_date" => "2018-06-30",
       "amount"   => 70000
    ],
    [
       "start_date" => "2018-07-01",
       "end_date" => "2018-12-01",
       "amount"   => 20000 
    ],
]

Basically, the amount from the first start date to the day before the next start date is the current amount. Then for any overlapping date range, the amount gets summed. So 20,000 + -15,000 = 5,000. Then, for the date range 2018-04-02 to 2018-06-30, the initial amount (still valid since it runs through december) will be added to the 50,000, making it 70,000. Finally, the end date is still running so we make a new entry for 20,000.

I could make pseudo-code for a single case like this, but these cases are all dynamic, and the overlapping dates and ranges could change.

Apparently, it's impossible / very hard to do this in MySQL, so I was wondering if anyone had experience doing these overlapping date ranges in PHP.

Edit: I've marked strawberry's answer as correct as it works. If possible, he can paste this with his answer and I can remove this edit, but there is some work to be done before his answer works (pre mysql 8.0). You can run this script to get his answer to work, before doing anything else (source is slightly modified from this version):

DROP TABLE IF EXISTS calendar;
CREATE TABLE calendar(
        id                      INTEGER PRIMARY KEY,  -- year*10000+month*100+day
        dt                      DATE NOT NULL,
        year                    INTEGER NOT NULL,
        month                   INTEGER NOT NULL, -- 1 to 12
        day                     INTEGER NOT NULL, -- 1 to 31
        quarter                 INTEGER NOT NULL, -- 1 to 4
        week                    INTEGER NOT NULL, -- 1 to 52/53
        day_name                VARCHAR(9) NOT NULL, -- 'Monday', 'Tuesday'...
        month_name              VARCHAR(9) NOT NULL, -- 'January', 'February'...
        holiday_flag            CHAR(1) DEFAULT 'f' CHECK (holiday_flag in ('t', 'f')),
        weekend_flag            CHAR(1) DEFAULT 'f' CHECK (weekend_flag in ('t', 'f')),
        event                   VARCHAR(50),
        UNIQUE td_ymd_idx (year,month,day),
        UNIQUE td_dt_idx (dt)

) Engine=MyISAM;

DROP PROCEDURE IF EXISTS fill_date_dimension;
DELIMITER //
CREATE PROCEDURE fill_date_dimension(IN startdate DATE,IN stopdate DATE)
BEGIN
    DECLARE currentdate DATE;
    SET currentdate = startdate;
    WHILE currentdate < stopdate DO
        INSERT INTO calendar VALUES (
                        YEAR(currentdate)*10000+MONTH(currentdate)*100 + DAY(currentdate),
                        currentdate,
                        YEAR(currentdate),
                        MONTH(currentdate),
                        DAY(currentdate),
                        QUARTER(currentdate),
                        WEEKOFYEAR(currentdate),
                        DATE_FORMAT(currentdate,'%W'),
                        DATE_FORMAT(currentdate,'%M'),
                        'f',
                        CASE DAYOFWEEK(currentdate) WHEN 1 THEN 't' WHEN 7 then 't' ELSE 'f' END,
                        NULL);
        SET currentdate = ADDDATE(currentdate,INTERVAL 1 DAY);
    END WHILE;
END
//
DELIMITER ;

TRUNCATE TABLE calendar;

CALL fill_date_dimension('1-01-01','2040-01-01');
OPTIMIZE TABLE calendar;

Upvotes: 0

Views: 70

Answers (1)

Strawberry
Strawberry

Reputation: 33935

Here's a solution that uses a simple calendar(dt) utility table...

DROP TABLE IF EXISTS my_table;

CREATE TABLE my_table
(start_date DATE NOT NULL PRIMARY KEY
,end_date DATE NOT NULL
,amount INT NOT NULL
);

INSERT INTO my_table VALUES
('2018-01-01','2018-12-01', 20000),
('2018-02-01','2018-04-01',-15000),
('2018-04-02','2018-06-30', 50000);

SELECT MIN(dt) range_start
     , MAX(dt) range_end
     , MIN(amount) amount
  FROM
     ( 
     SELECT dt
     , amount
     , CASE WHEN @prev=amount THEN @i:=@i ELSE @i:=@i+1 END i
     , @prev:=amount
  FROM 
     ( SELECT x.*
            , SUM(y.amount) amount 
         FROM calendar x 
         JOIN my_table y 
           ON x.dt BETWEEN y.start_date AND y.end_date 
        GROUP 
           BY x.dt 
      ) a
   JOIN (SELECT @prev:=null,@i:=0) vars
 ORDER 
    BY dt
   ) n
   GROUP 
     BY i;

+-------------+------------+--------+
| range_start | range_end  | amount |
+-------------+------------+--------+
| 2018-01-01  | 2018-01-31 |  20000 |
| 2018-02-01  | 2018-04-01 |   5000 |
| 2018-04-02  | 2018-06-30 |  70000 |
| 2018-07-01  | 2018-12-01 |  20000 |
+-------------+------------+--------+

Upvotes: 2

Related Questions