GLaDOS
GLaDOS

Reputation: 672

How do I subtract aggregated counts based on different states two columns in one table?

Using MariaDB, I am trying to get a monthly total of items that were created minus items that were deleted that month, for each month. If no items were deleted, the total should be just the number of items that were created that month. If more items were deleted than created, the total should be a negative number.

The table has a created_at column which is never null and a deleted_at column which is set once the item has been 'deleted'

To illustrate, the (simplified) schema is like this: TABLE Items:

+----------------------------------------------------------------------------+
|   idItem  | created_at                     | deleted_at                    |
+----------------------------------------------------------------------------+
|  1        |  2020-03-20T04:28:41.000+00:00 | 2021-07-27T02:36:05.000+00:00 |
|  2        |  2020-03-20T04:28:41.000+00:00 | 2021-07-27T02:36:05.000+00:00 |
|  3        |  2021-03-02T21:39:10.000+00:00 |  ∅                            |
|  4        |  2021-03-05T21:13:13.000+00:00 |  ∅                            |
|  5        |  2021-06-08T13:49:11.000+00:00 | 2021-07-27T02:36:05.000+00:00 |
|  6        |  2021-07-13T02:36:05.000+00:00 |  ∅
|  7        |  2021-09-17T21:12:13.000+00:00 |  ∅                            |
+----------------------------------------------------------------------------+

The information I need is the monthly total that have not been deleted, like so:

+-----------------------------------+
|   total_existing  | during_month  |     
+-----------------------------------+
|    2              | 2020-03       | -- two were added
+-----------------------------------+
|    4              | 2021-03       | -- another two were created
+-----------------------------------+
|    5              | 2021-06       | -- another was added
+-----------------------------------+
|    3              | 2021-07       | -- three deleted, one added
+-----------------------------------+
|    4              | 2021-09       | -- one added
+-----------------------------------+

Ultimately, I need to display the total for each month. I've tried this but it's not right.

SELECT
  count(created.idItem) AS monthly_created_count,
  count(deleted.idItem) AS monthly_deleted_count,
  count(created.idItem) - count(deleted.idItem) as total,
  DATE_FORMAT(created.created_at, '%Y-%m') as created_month ,
  DATE_FORMAT(deleted.deleted_at, '%Y-%m') as deleted_month 
FROM 
    Item created 
  LEFT JOIN 
    Item deleted 
  ON 
  DATE_FORMAT(deleted.deleted_at, '%Y-%m') = DATE_FORMAT(created.created_at, '%Y-%m')
GROUP BY DATE_FORMAT(created.created_at, '%Y-%m'), DATE_FORMAT(deleted.deleted_at, '%Y-%m')

I keep thinking I'm so close, but when we look at the rows where the deleted_at dates are set, it's obvious I'm off the mark.

Upvotes: 0

Views: 72

Answers (1)

SOS
SOS

Reputation: 6550

If you're looking for a cumulative total of rows created/deleted, one approach is COUNT the number of records created and deleted by month/year separately. Then join the counts together with UNION ALL and calculate the sum totals:

SELECT t.YearMonth
      , SUM(t.TotalCreated) - SUM(t.TotalDeleted) AS TotalExisting
FROM  (
         SELECT DATE_FORMAT(created_at, '%Y-%m') AS YearMonth
                , COUNT(*) AS TotalCreated
                , 0 AS TotalDeleted
         FROM   Item
         GROUP BY DATE_FORMAT(created_at, '%Y-%m')
         
         UNION ALL

         SELECT DATE_FORMAT(deleted_at, '%Y-%m') AS YearMonth
                , 0 AS TotalCreated 
                , COUNT(*) AS TotalDeleted
         FROM   Item
         WHERE  deleted_at IS NOT NULL
         GROUP BY DATE_FORMAT(deleted_at, '%Y-%m')
) t         
GROUP BY t.YearMonth
ORDER BY t.YearMonth

Results:

 YearMonth | TotalExisting
 :-------- | ------------:
 2020-03   |             2
 2021-03   |             2
 2021-06   |             1
 2021-07   |            -2
 2021-09   |             1

Then wrap those statements in a CTE and use a Window Function to calculate the rolling total:

See also db<>fiddle

WITH cte AS (        
    SELECT t.YearMonth
           , SUM(t.TotalCreated) - SUM(t.TotalDeleted) AS TotalExisting
    FROM  (
             SELECT DATE_FORMAT(created_at, '%Y-%m') AS YearMonth
                   , COUNT(*) AS TotalCreated
                   , 0 AS TotalDeleted
             FROM   Item
             GROUP BY DATE_FORMAT(created_at, '%Y-%m')
             UNION ALL
             SELECT DATE_FORMAT(deleted_at, '%Y-%m') AS YearMonth
                    , 0 AS TotalCreated 
                    , COUNT(*) AS TotalDeleted
             FROM   Item
             WHERE  deleted_at IS NOT NULL
             GROUP BY DATE_FORMAT(deleted_at, '%Y-%m')
     ) t         
     GROUP BY t.YearMonth
     ORDER BY t.YearMonth
)
SELECT YearMonth, SUM(TotalExisting) OVER (ORDER BY YearMonth) AS TotalExisting
FROM   cte;

Final Results:

YearMonth | TotalExisting
:-------- | ------------:
2020-03   |             2
2021-03   |             4
2021-06   |             5
2021-07   |             3
2021-09   |             4

Upvotes: 1

Related Questions