How to perform a sum for all previous records

Question

I've been trying to implement the solution here with the added flavour of updating existing records. As an MRE I'm looking to populate the sum_date_diff column in a table with the sum of all the differences between the current row date and the date of every previous row where the current row p1_id matches the previous row p1_id or p2_id. I have already filled out the expected result below:

+-----+------------+-------+-------+---------------+
| id_ | date_time  | p1_id | p2_id | sum_date_diff |
+-----+------------+-------+-------+---------------+
|   1 | 2000-01-01 |     1 |     2 | Null          |
|   2 | 2000-01-02 |     2 |     4 | 1             |
|   3 | 2000-01-04 |     1 |     3 | 3             |
|   4 | 2000-01-07 |     2 |     5 | 11            |
|   5 | 2000-01-15 |     2 |     3 | 35            |
|   6 | 2000-01-20 |     1 |     3 | 35            |
|   7 | 2000-01-31 |     1 |     3 | 68            |
+-----+------------+-------+-------+---------------+

My query so far looks like:

UPDATE test.sum_date_diff AS sdd0
        JOIN
    (SELECT 
        id_,
            SUM(DATEDIFF(sdd1.date_time, sq.date_time)) AS sum_date_diff
    FROM
        test.sum_date_diff AS sdd1
    LEFT OUTER JOIN (SELECT 
        sdd2.date_time AS date_time, sdd2.p1_id AS player_id
    FROM
        test.sum_date_diff AS sdd2 UNION ALL SELECT 
        sdd3.date_time AS date_time, sdd3.p2_id AS player_id
    FROM
        test.sum_date_diff AS sdd3) AS sq ON sq.date_time < sdd1.date_time
        AND sq.player_id = sdd1.p1_id
    GROUP BY sdd1.id_) AS master_sq ON master_sq.id_ = sdd0.id_ 
SET 
    sdd0.sum_date_diff = master_sq.sum_date_diff

This works as shown here.

However, on a table of 1.5m records the query has been hanging for the last hour. Even when I add a WHERE clause onto the bottom to restrict the update to a single record then it hangs for 5 mins+.

Here is the EXPLAIN statement for the query on the full table:

+----+-------------+---------------+------------+-------+-----------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------+---------+-------+---------+----------+--------------------------------------------+
| id | select_type |     table     | partitions | type  |                                                              possible_keys                                                              |                   key                   | key_len |  ref  |  rows   | filtered |                   Extra                    |
+----+-------------+---------------+------------+-------+-----------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------+---------+-------+---------+----------+--------------------------------------------+
|  1 | UPDATE      | sum_date_diff | NULL       | const | PRIMARY                                                                                                                                 | PRIMARY                                 | 4       | const |       1 |      100 | NULL                                       |
|  1 | PRIMARY     |     | NULL       | ref   |                                                                                                                              |                              | 4       | const |      10 |      100 | NULL                                       |
|  2 | DERIVED     | sum_date_diff | NULL       | index | PRIMARY,ix__match_oc_history__date_time,ix__match_oc_history__p1_id,ix__match_oc_history__p2_id,ix__match_oc_history__date_time_players | ix__match_oc_history__date_time_players | 14      | NULL  | 1484288 |      100 | Using index; Using temporary               |
|  2 | DERIVED     |     | NULL       | ALL   | NULL                                                                                                                                    | NULL                                    | NULL    | NULL  | 2968576 |      100 | Using where; Using join buffer (hash join) |
|  3 | DERIVED     | sum_date_diff | NULL       | index | NULL                                                                                                                                    | ix__match_oc_history__date_time_players | 14      | NULL  | 1484288 |      100 | Using index                                |
|  4 | UNION       | sum_date_diff | NULL       | index | NULL                                                                                                                                    | ix__match_oc_history__date_time_players | 14      | NULL  | 1484288 |      100 | Using index                                |
+----+-------------+---------------+------------+-------+-----------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------+---------+-------+---------+----------+--------------------------------------------+

Here is the CREATE TABLE statement:

CREATE TABLE `sum_date_diff` (
  `id_` int NOT NULL AUTO_INCREMENT,
  `date_time` datetime DEFAULT NULL,
  `p1_id` int NOT NULL,
  `p2_id` int NOT NULL,
  `sum_date_diff` int DEFAULT NULL,
  PRIMARY KEY (`id_`),
  KEY `ix__sum_date_diff__date_time` (`date_time`),
  KEY `ix__sum_date_diff__p1_id` (`p1_id`),
  KEY `ix__sum_date_diff__p2_id` (`p2_id`),
  KEY `ix__sum_date_diff__date_time_players` (`date_time`,`p1_id`,`p2_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1822120 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

MySQL version is 8.0.26 running on a 2016 MacBook Pro with Monterey with 16Gb RAM.

After reading around about boosting the RAM available to MySQL I've added the following to the standard my.cnf file:

innodb_buffer_pool_size = 8G
tmp_table_size=2G
max_heap_table_size=2G

I'm wondering if:

I've done something wrong
This is just a very slow task no matter what I do
There is a faster method

I'm hoping someone could enlighten me!

How to perform a sum for all previous records

Answers (1)

Related Questions