Kofi Black
Kofi Black

Reputation: 121

MySQL group and aggregate for several consecutive days

My table is designed to contain pets and their activity data, e.g, sleep time and exercise time. The table definition is provided below.

CREATE TABLE IF NOT EXISTS TEST_ACTIVITY(
   pet_id INT(11) UNSIGNED NOT NULL,
   simple_sleep_time INT(11),
   deep_sleep_time INT(11),
   mild_movement_time INT(11),
   moderate_movement_time INT(11),
   severe_movement_time INT(11),
   start_time INT(11),
   date DATE,
   PRIMARY KEY (pet_id, start_time, date)
) ENGINE=INNODB

The following data is inserted to the table.

INSERT INTO TEST_ACTIVITY (pet_id, simple_sleep_time, deep_sleep_time,mild_movement_time, moderate_movement_time, severe_movement_time, start_time, date) 
VALUES 
(100,  1, 1, 1, 1, 1, 0, '2015-03-10'), 
(100,  2, 1, 1, 1, 1, 30, '2015-03-10'),  
(100,  3, 1, 1, 1, 1, 0, '2015-03-11'), 
(100,  4, 1, 1, 1, 1, 30, '2015-03-11'), 
(100,  5, 1, 1, 1, 1, 0, '2015-03-12'), 
(100,  6, 1, 1, 1, 1, 30, '2015-03-12'), 
(100,  7, 1, 1, 1, 1, 0, '2015-03-13'), 
(100,  8, 1, 1, 1, 1, 30, '2015-03-13'),  
(101,  9, 1, 1, 1, 1, 0, '2015-03-10'), 
(101,  10, 1, 1, 1, 1, 30, '2015-03-10'), 
(101,  11, 1, 1, 1, 1, 0, '2015-03-11'), 
(101,  12, 1, 1, 1, 1, 30, '2015-03-11'),
(101,  13, 1, 1, 1, 1, 0, '2015-03-12'),
(101,  14, 1, 1, 1, 1, 30, '2015-03-12'),
(101,  15, 1, 1, 1, 1, 0, '2015-03-13'),
(101,  16, 1, 1, 1, 1, 30, '2015-03-13'),
(102,  17, 1, 1, 1, 1, 0, '2015-03-10'),
(102,  18, 1, 1, 1, 1, 30, '2015-03-10'),
(102,  19, 1, 1, 1, 1, 0, '2015-03-11'),
(102,  20, 1, 1, 1, 1, 30, '2015-03-11'),
(102,  21, 1, 1, 1, 1, 0, '2015-03-12'),
(102,  22, 1, 1, 1, 1, 30, '2015-03-12'),
(102,  23, 1, 1, 1, 1, 0, '2015-03-13'),
(102,  24, 1, 1, 1, 1, 30, '2015-03-13');

select * from TEST_ACTIVITY ORDER BY pet_id, date, start_time;

+--------+-------------------+-----------------+--------------------+------------------------+----------------------+------------+------------+
| pet_id | simple_sleep_time | deep_sleep_time | mild_movement_time | moderate_movement_time | severe_movement_time | start_time | date       |
+--------+-------------------+-----------------+--------------------+------------------------+----------------------+------------+------------+
|    100 |                 1 |               1 |                  1 |                      1 |                    1 |          0 | 2015-03-10 |
|    100 |                 2 |               1 |                  1 |                      1 |                    1 |         30 | 2015-03-10 |
|    100 |                 3 |               1 |                  1 |                      1 |                    1 |          0 | 2015-03-11 |
|    100 |                 4 |               1 |                  1 |                      1 |                    1 |         30 | 2015-03-11 |
|    100 |                 5 |               1 |                  1 |                      1 |                    1 |          0 | 2015-03-12 |
|    100 |                 6 |               1 |                  1 |                      1 |                    1 |         30 | 2015-03-12 |
|    100 |                 7 |               1 |                  1 |                      1 |                    1 |          0 | 2015-03-13 |
|    100 |                 8 |               1 |                  1 |                      1 |                    1 |         30 | 2015-03-13 |
|    101 |                 9 |               1 |                  1 |                      1 |                    1 |          0 | 2015-03-10 |
|    101 |                10 |               1 |                  1 |                      1 |                    1 |         30 | 2015-03-10 |
|    101 |                11 |               1 |                  1 |                      1 |                    1 |          0 | 2015-03-11 |
|    101 |                12 |               1 |                  1 |                      1 |                    1 |         30 | 2015-03-11 |
|    101 |                13 |               1 |                  1 |                      1 |                    1 |          0 | 2015-03-12 |
|    101 |                14 |               1 |                  1 |                      1 |                    1 |         30 | 2015-03-12 |
|    101 |                15 |               1 |                  1 |                      1 |                    1 |          0 | 2015-03-13 |
|    101 |                16 |               1 |                  1 |                      1 |                    1 |         30 | 2015-03-13 |
|    102 |                17 |               1 |                  1 |                      1 |                    1 |          0 | 2015-03-10 |
|    102 |                18 |               1 |                  1 |                      1 |                    1 |         30 | 2015-03-10 |
|    102 |                19 |               1 |                  1 |                      1 |                    1 |          0 | 2015-03-11 |
|    102 |                20 |               1 |                  1 |                      1 |                    1 |         30 | 2015-03-11 |
|    102 |                21 |               1 |                  1 |                      1 |                    1 |          0 | 2015-03-12 |
|    102 |                22 |               1 |                  1 |                      1 |                    1 |         30 | 2015-03-12 |
|    102 |                23 |               1 |                  1 |                      1 |                    1 |          0 | 2015-03-13 |
|    102 |                24 |               1 |                  1 |                      1 |                    1 |         30 | 2015-03-13 |
+--------+-------------------+-----------------+--------------------+------------------------+----------------------+------------+------------+    

I want to first calculate each pet's daily score using this formula: score = SUM(severe_movement_time) + SUM(moderate_movement_time) + SUM(mild_movement_time) + SUM(simple_sleep_time), and then decide each pet's rank based on its score. By using the following query, I can do it for one day, 2015-03-10.

SELECT pet_id, date, score, rank 
FROM 
(
SELECT t.pet_id, t.date, t.score, @prev := @curr, @curr := score, @rank := IF(@prev = @curr, @rank, @rank+1) AS rank 
FROM (
    SELECT pet_id, date, COALESCE(SUM(simple_sleep_time), 0) shallow, 
           COALESCE(SUM(deep_sleep_time), 0) deep, COALESCE(SUM(mild_movement_time), 0) light, 
           COALESCE(SUM(moderate_movement_time), 0) moderate, COALESCE(SUM(severe_movement_time), 0) heavy, 
          (COALESCE(SUM(severe_movement_time), 0) + COALESCE(SUM(moderate_movement_time), 0) + COALESCE(SUM(mild_movement_time), 0) + COALESCE(SUM(simple_sleep_time), 0)) score 
    FROM TEST_ACTIVITY  
    WHERE date = DATE('2015-03-10')
    GROUP BY pet_id 
    ORDER BY score DESC
) t, (SELECT @curr := null, @prev := null, @rank := 0) r ORDER BY score DESC
) x 

My question is how to write a single query to calculate the scores and rankings for several consecutive days, for example, 4 days (start_date=2015-03-10, end_date=2015-03-13). The expected result is listed below.

| pet_id | date       | score | rank |
+--------+------------+-------+------+
|    102 | 2015-03-10 | 41    |    1 |
|    101 | 2015-03-10 | 25    |    2 |
|    100 | 2015-03-10 | 9     |    3 |
|    102 | 2015-03-11 | 45    |    1 |
|    101 | 2015-03-11 | 29    |    2 |
|    100 | 2015-03-11 | 13    |    3 |
|    102 | 2015-03-12 | 49    |    1 |
|    101 | 2015-03-12 | 33    |    2 |
|    100 | 2015-03-12 | 17    |    3 |
|    102 | 2015-03-13 | 53    |    1 |
|    101 | 2015-03-13 | 37    |    2 |
|    100 | 2015-03-13 | 21    |    3 |

Upvotes: 0

Views: 58

Answers (1)

toonice
toonice

Reputation: 2236

Please try the following...

SELECT pet_id,
       date,
       score,
       rank 
FROM 
(
    SELECT pet_id,
           date,
           score,
           @prevDate := COALESCE( @currDate,
                                  CURDATE() ) as prevDate,
           @currDate := date,
           @prevScore := COALESCE( @currScore,
                                   -1 ) AS prevScore,
           @currScore := score,
           @rank := IF( @prevDate <> @currDate,
                        1,
                        IF( @prevScore = @currScore,
                            @rank,
                            @rank + 1 ) ) AS rank
    FROM
    (
        SELECT pet_id,
               date,
               ( COALESCE( SUM( severe_movement_time ),
                           0 ) +
                     COALESCE( SUM( moderate_movement_time ),
                               0 ) +
                     COALESCE( SUM( mild_movement_time ),
                               0 ) +
                     COALESCE( SUM( simple_sleep_time ),
                               0 ) ) AS score
        FROM TEST_ACTIVITY  
        WHERE date BETWEEN DATE( '2015-03-10' ) AND DATE( '2015-03-13' )
        GROUP BY pet_id,
                 date
    ) AS scoreFinder, ( SELECT @currDate := NULL,
                               @prevDate := NULL,
                               @currScore := NULL,
                               @prevScore := NULL,
                               @rank := 0 ) AS r
    ORDER BY date,
             score DESC
) AS rankFinder;

I based my Answer on the code that you gave us. My first modification was to remove the calculations of shallow, deep, etc., as these are never referenced again after their calculation.

My next modification was to change the inner-most query's WHERE clause to use the BETWEEN operator to define the date range that records to be considered must come from. I have used the explicit values from your Question, but you can of course use other values or variables holding those values.

Since we are interested in each pet_id's score for each date, I expanded this lists grouping clause to GROUP BY pet_id and date pairings.

Since no sorting of the list is needed at this point I removed the ORDER BY clause.

I dislike ultrashort aliases as the more complex a statement the easier it is to lose track of what they represent and make mistakes, either when coding or debugging, which is why I changed t to scoreFinder (I try to use short but descriptive aliases). You are of course free to call it pretty much whatever you like.

This list joined with valuesInitialiser forms the basis of the middle-most query, which starts by ordering this base by date, and subsorting on score.

With a sorted list now available computation of the rank can begin. Since a record's rank is dependent on both its date and its score in comparison to those of the previous record, the current and previous values of each must be tracked. As the more dominant field in the sorting, the values for date will need to be compared first. If they are different then the first record of a new date is being examined, which means that the value of rank will need to be (re)initialised to 1. If the two dates are the same, then a change in score will need to be tested. If the score has not changed then the current value of rank will need to be used. Otherwise the current value of rank will need to be incremented and the new value used.

With the values of rank generated, all that remains is to select just the desired fields.

To test this statement I ran it against your supplied data, which produced the desired results. However, I noticed that the scores from the sample data were all different, so I changed one of the records from ( 101, 12, 1, 1, 1, 1, 30, '2015-03-11' ) to ( 101, 12, 1, 1, 1, 17, 30, '2015-03-11' ) so that the score for that pet_id and that date matched that for pet_id = 102 for that date. I than retested my statement, which as hoped produced the same rank for both pet_id's.

If you have any questions or comments, then please feel free to post a Comment accordingly.

Upvotes: 1

Related Questions