ZeroStack
ZeroStack

Reputation: 1109

Add missing rows for a unique group set by date in PostgreSQL

I have a table that contains a game_id, category, date column representing a month of activity denoted as the first day of the month date_month, and a total amount.

In some months there are missing categories for a game_id for a date_month, I need to fill these missing rows for the missing unique group set across the whole table within each month for each game.

Take the following example:

CREATE TEMPORARY TABLE activity (
  game_id INT,
  category TEXT, 
  date DATE,
  amount INT
);

INSERT INTO activity (game_id, category, date, amount) VALUES 
  (1, 'Up', '2015-12-01', 9)
, (1, 'Down', '2015-12-01', 12)
-- Left Missing for '2015-12-01
-- Right Missing for '2015-12-01
, (1, 'Up', '2016-01-01', 12)
,  (1, 'Down', '2016-01-01', 4)
,  (1, 'Left', '2016-01-01', 7)
,  (1, 'Right', '2016-01-01', 3)
,  (1, 'Up', '2016-02-01', 3)
,  (1, 'Down', '2016-02-01', 11)
,  (1, 'Left', '2016-02-01', 4)
,  (1, 'Right', '2016-02-01', 8)
,  (1, 'Up', '2016-03-01', 3)
,  (1, 'Down', '2016-03-01', 11)
-- Left Missing for '2016-03-01'
,  (1, 'Right', '2016-03-01', 8)
,  (1, 'Up', '2016-04-01', 3)
,  (1, 'Down', '2016-04-01', 11)
,  (1, 'Left', '2016-04-01', 4)
-- Right Missing for '2016-04-01'
,  (2, 'Up', '2020-12-01', 9)
, (2, 'Down', '2020-12-01', 12)
-- Left Missing for '2020-12-01'
-- Right Missing for '2020-12-01'
, (2, 'Up', '2020-01-01', 12)
,  (2, 'Down', '2020-01-01', 4)
,  (2, 'Left', '2020-01-01', 7)
-- Right Missing for '2020-01-01'
;

In this instance, the following missing values would need to be created with a 0 amount, a game_id can have a different set of date ranges.

(1, 'Left', '2015-12-01', 0)
(1, 'Right', '2015-12-01', 0)
(1, 'Left', '2016-03-01', 0)
(1, 'Right', '2016-04-01', 0)
(2, 'Left', '2020-12-01', 0)
(2, 'Right', '2020-12-01', 0)
(2, 'Right', '2020-01-01', 0)

My attempt so far with the intention of using it in a UNION back onto the primary table. This yields no rows as the missing groups beyond their minimum and maximum date ranges do not get generated.

SELECT
    game_id,
    category,
    generate_series(
        min(date_month),
        max(date_month),
        '1month'
        )::date AS date_month,
    0 as amount
FROM activity
WHERE NOT EXISTS (
    SELECT
        1
    FROM activity
    WHERE game_id=game_id AND category=category AND date_month=date_month
) 
GROUP BY 1,2
ORDER BY game_id, date_month

Upvotes: 2

Views: 814

Answers (1)

forpas
forpas

Reputation: 164089

You must CROSS join the distinct game_id and date combinations of the table to the distinct category of the table and then LEFT join to the table:

SELECT d.game_id, c.category, d.date, COALESCE(a.amount, 0) amount
FROM (SELECT DISTINCT game_id, date FROM activity) d
CROSS JOIN (SELECT DISTINCT category FROM activity) c
LEFT JOIN activity a 
ON a.game_id = d.game_id AND a.date = d.date AND a.category = c.category
ORDER BY d.game_id, d.date

If you want to insert the missing rows in the table:

INSERT INTO activity (game_id, category, date, amount)
SELECT d.game_id, c.category, d.date, 0
FROM (SELECT DISTINCT game_id, date FROM activity) d
CROSS JOIN (SELECT DISTINCT category FROM activity) c
LEFT JOIN activity a 
ON a.game_id = d.game_id AND a.date = d.date AND a.category = c.category
WHERE a.game_id IS NULL

See the demo.

Upvotes: 2

Related Questions