WagnerMatosUK
WagnerMatosUK

Reputation: 4429

How to get Postgres to return 0 for empty rows

I have a query which get data summarised between two dates like so:

SELECT date(created_at),
COUNT(COALESCE(id, 0))                                                                      AS total_orders,
   SUM(COALESCE(total_price, 0))                                                               AS total_price,
   SUM(COALESCE(taxes, 0))                                                                     AS taxes,
   SUM(COALESCE(shipping, 0))                                                                  AS shipping,
   AVG(COALESCE(total_price, 0))                                                               AS average_order_value,
   SUM(COALESCE(total_discount, 0))                                                            AS total_discount,
   SUM(total_price - COALESCE(taxes, 0) - COALESCE(shipping, 0) - COALESCE(total_discount, 0)) as net_sales
FROM orders
WHERE shop_id = 43
  AND orders.active = true
  AND orders.created_at >= '2022-07-20'
  AND orders.created_at <= '2022-07-26'
GROUP BY date (created_at)
order by created_at::date desc

However for dates that do not have any orders, the query returns nothing and I'd like to return 0.

I have tried with COALESCE but that doesn't seem to do the trick?

Any suggestions?

Upvotes: 1

Views: 1289

Answers (3)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656301

This should be substantially faster - and correct:

SELECT *
     , total_price - taxes - shipping - total_discount AS net_sales  -- ⑤
FROM  (
   SELECT created_at
        , COALESCE(total_orders        , 0) AS total_orders
        , COALESCE(total_price         , 0) AS total_price
        , COALESCE(taxes               , 0) AS taxes
        , COALESCE(shipping            , 0) AS shipping
        , COALESCE(average_order_value , 0) AS average_order_value
        , COALESCE(total_discount      , 0) AS total_discount
   FROM   generate_series(timestamp '2022-07-20'  -- ①
                        , timestamp '2022-07-26'
                        , interval '1 day') AS g(created_at)
   LEFT  JOIN (  -- ③
      SELECT created_at::date
           , count(*)            AS total_orders  -- ⑥
           , sum(total_price)    AS total_price
           , sum(taxes)          AS taxes
           , sum(shipping)       AS shipping
           , avg(total_price)    AS average_order_value
           , sum(total_discount) AS total_discount
      FROM   orders
      WHERE  shop_id = 43
      AND    active  -- simpler
      AND    created_at >= '2022-07-20'
      AND    created_at <  '2022-07-27'  -- ② !
      GROUP  BY 1
      ) o USING (created_at)  -- ④
   ) sub
ORDER  BY created_at DESC;

db<>fiddle here

I copied, simplified, and extended Xu's fiddle for comparison.

① Why this particular form for generate_series()? See:

② Assuming created_at is data type timestamp your original formulation is most probably incorrect. created_at <= '2022-07-26' would include the first instant of '2022-07-26' and exclude the rest. To include all of '2022-07-26', use created_at < '2022-07-27'. See:

③ The LEFT JOIN is the core feature of this answer. Generate all days with generate_series(), independently aggregate days from table orders, then LEFT JOIN to retain one row per day like you requested.

④ I made the column name match created_at, so we can conveniently shorten the join syntax with the USING clause.

⑤ Compute net_sales in an outer SELECT after replacing NULL values, so we need COALESCE() only once.

count(*) is equivalent to COUNT(COALESCE(id, 0)) in any case, but cheaper. See:

Upvotes: 2

Courser Xu
Courser Xu

Reputation: 142

Please refer to the below script.

SELECT *
FROM 
    (SELECT date(created_at) AS created_at,
         COUNT(id) AS total_orders,
         SUM(total_price) AS total_price,
         SUM(taxes) AS taxes,
         SUM(shipping) AS shipping,
         AVG(total_price) AS average_order_value,
         SUM(total_discount) AS total_discount,
         SUM(total_price - taxes - shipping - total_discount) AS net_sales
    FROM orders
    WHERE shop_id = 43
        AND orders.active = true
        AND orders.created_at >= '2022-07-20'
        AND orders.created_at <= '2022-07-26'
    GROUP BY  date (created_at)
UNION
SELECT dates AS created_at,
         0 AS total_orders,
         0 AS total_price,
         0 AS taxes,
         0 AS shipping,
         0 AS average_order_value,
         0 AS total_discount,
         0 AS net_sales
FROM generate_series('2022-07-20', '2022-07-26', interval '1 day') AS dates
WHERE dates NOT IN 
    (SELECT created_at
    FROM orders
    WHERE shop_id = 43
        AND orders.active = true
        AND orders.created_at >= '2022-07-20'
        AND orders.created_at <= '2022-07-26' ) ) a
ORDER BY  created_at::date desc;

There is one sample for your reference. Sample

I got your duplicate test cases at my side. The root cause is created_at field (datattype:timestamp), hence there are duplicate lines.

Below script is correct for your request.

SELECT *
FROM 
    (SELECT date(created_at) AS created_at,
         COUNT(id) AS total_orders,
         SUM(total_price) AS total_price,
         SUM(taxes) AS taxes,
         SUM(shipping) AS shipping,
         AVG(total_price) AS average_order_value,
         SUM(total_discount) AS total_discount,
         SUM(total_price - taxes - shipping - total_discount) AS net_sales
    FROM orders
    WHERE shop_id = 43
        AND orders.active = true
        AND orders.created_at >= '2022-07-20'
        AND orders.created_at <= '2022-07-26'
    GROUP BY  date (created_at)
UNION
SELECT dates AS created_at,
         0 AS total_orders,
         0 AS total_price,
         0 AS taxes,
         0 AS shipping,
         0 AS average_order_value,
         0 AS total_discount,
         0 AS net_sales
FROM generate_series('2022-07-20', '2022-07-26', interval '1 day') AS dates
WHERE dates NOT IN 
    (SELECT date (created_at)
    FROM orders
    WHERE shop_id = 43
        AND orders.active = true
        AND orders.created_at >= '2022-07-20'
        AND orders.created_at <= '2022-07-26' ) ) a
ORDER BY  created_at::date desc;

Here is a sample that's same with your side. Link

Upvotes: 1

Siddheshwar Soni
Siddheshwar Soni

Reputation: 430

You can use WITH RECURSIVE to build a table of dates and then select dates that are not in your table

WITH RECURSIVE t(d) AS (
  (SELECT '2015-01-01'::date)
UNION ALL
  (SELECT d + 1 FROM t WHERE d + 1 <= '2015-01-10')
) SELECT d FROM t WHERE d NOT IN (SELECT d_date FROM tbl);

[look on this post : ][1]

[1]: https://stackoverflow.com/questions/28583379/find-missing-dates-postgresql#:~:text=You%20can%20use%20WITH%20RECURSIVE,SELECT%20d_date%20FROM%20tbl)%3B

Upvotes: 0

Related Questions