Ale
Ale

Reputation: 115

Counting records based on conditions over potential null values

I have this tables:

CREATE TABLE customer_orders (
  "order_id" INTEGER,
  "customer_id" INTEGER,
  "pizza_id" INTEGER,
  "exclusions" VARCHAR(4),
  "extras" VARCHAR(4),
  "order_time" TIMESTAMP
);
INSERT INTO customer_orders
  ("order_id", "customer_id", "pizza_id", "exclusions", "extras", "order_time")
VALUES
  ('1', '101', '1', '', '', '2020-01-01 18:05:02'),
  ('2', '101', '1', '', '', '2020-01-01 19:00:52'),
  ('3', '102', '1', '', '', '2020-01-02 23:51:23'),
  ('3', '102', '2', '', NULL, '2020-01-02 23:51:23'),
  ('4', '103', '1', '4', '', '2020-01-04 13:23:46'),
  ('4', '103', '1', '4', '', '2020-01-04 13:23:46'),
  ('4', '103', '2', '4', '', '2020-01-04 13:23:46'),
  ('5', '104', '1', NULL, '1', '2020-01-08 21:00:29'),
  ('6', '101', '2', NULL, NULL, '2020-01-08 21:03:13'),
  ('7', '105', '2', NULL, '1', '2020-01-08 21:20:29'),
  ('8', '102', '1', NULL, NULL, '2020-01-09 23:54:33'),
  ('9', '103', '1', '4', '1, 5', '2020-01-10 11:22:59'),
  ('10', '104', '1', NULL, NULL, '2020-01-11 18:34:49'),
  ('10', '104', '1', '2, 6', '1, 4', '2020-01-11 18:34:49');

CREATE TABLE runner_orders (
  "order_id" INTEGER,
  "runner_id" INTEGER,
  "pickup_time" VARCHAR(19),
  "distance" DECIMAL(5,2) NULL,
  "duration" VARCHAR(10),
  "cancellation" VARCHAR(23)
);
INSERT INTO runner_orders
  ("order_id", "runner_id", "pickup_time", "distance", "duration", "cancellation")
VALUES
  ('1', '1', '2020-01-01 18:15:34', '20', '32', ''),
  ('2', '1', '2020-01-01 19:10:54', '20', '27', ''),
  ('3', '1', '2020-01-03 00:12:37', '13.4', '20', NULL),
  ('4', '2', '2020-01-04 13:53:03', '23.4', '40', NULL),
  ('5', '3', '2020-01-08 21:10:57', '10', '15', NULL),
  ('6', '3', NULL, NULL, NULL, 'Restaurant Cancellation'),
  ('7', '2', '2020-01-08 21:30:45', '25', '25mins', NULL),
  ('8', '2', '2020-01-10 00:15:02', '23.4', '15', NULL),
  ('9', '2', NULL, NULL, NULL, 'Customer Cancellation'),
  ('10', '1', '2020-01-11 18:50:20', '10', '10', NULL);

And I they gave me the question: For each customer, how many delivered pizzas had at least 1 change and how many had no changes?

Now, I tried to do this:

WITH cte_1
AS (
    SELECT co.customer_id
        , co.order_id
        , co.exclusions
        , isnull(co.exclusions,'') AS exc
        , isnull(co.extras,'') AS ext
    FROM customer_orders as co
    INNER JOIN runner_orders as ru 
        ON co.order_id = ru.order_id
    WHERE NOT ru.cancellation IN ('Restaurant Cancellation', 'Customer Cancellation')    
)    
SELECT customer_id
    , COUNT(order_id) AS no_changes 
FROM cte_1
WHERE exc = '' AND ext = ''
GROUP BY customer_id

As you can see this so long but at least works and the numbers are correct. But then when I try to do the same to get the result for pizzas with changes like this:

WITH cte_1
AS (
    SELECT co.customer_id
        , co.order_id
        , co.exclusions
        , isnull(co.exclusions,'') AS exc
        , isnull(co.extras,'') AS ext
    FROM customer_orders as co
    INNER JOIN runner_orders as ru 
        ON co.order_id = ru.order_id
    WHERE NOT ru.cancellation IN ('Restaurant Cancellation', 'Customer Cancellation')    
)    
SELECT customer_id
    , COUNT(order_id) AS with_changes 
FROM cte_1
WHERE not exc = '' 
GROUP BY customer_id
UNION
SELECT customer_id
    , COUNT(order_id) AS with_changes 
FROM cte_1
WHERE not ext = '' 
GROUP BY customer_id

It doesn't work, and I'm not sure if it's because of the Nulls and blanks or the approach in general. Please if there is a better and shorter way to do this.

EXPECTED RESULT would be something like:

customer_id orders_with_changes orders_with_NO_changes
101 --- 2
102 --- 3
103 3 ---
104 2 1
105 1 ---

Upvotes: 1

Views: 77

Answers (1)

lemon
lemon

Reputation: 15482

You can try approaching this problem by:

  • filtering on the runners table with the NOT EXISTS operator, to check and discard the orders that were subject to alteration by third-party, before any aggregation is carried out
  • applying conditional aggregation with COUNT + CASE expression, by counting records that satisfy conditions (orders without changes are the ones that have both extras and exclusions null, while order with changes are identified by the opposite condition)
SELECT customer_id, 
       COUNT(CASE WHEN NOT COALESCE(extras, '') = '' 
                    OR NOT COALESCE(exclusions, '') = '' THEN 1 END) AS num_with_changes,
       COUNT(CASE WHEN COALESCE(extras, '') = ''
                   AND COALESCE(exclusions, '') = '' THEN 1 END) AS num_with_no_changes
FROM customer_orders co
WHERE NOT EXISTS(SELECT 1 
                 FROM runner_orders ro 
                 WHERE co.order_id = ro.order_id 
                   AND NOT COALESCE(cancellation, '') = '')
GROUP BY customer_id

Output:

customer_id num_with_changes num_with_no_changes
101 0 2
102 0 3
103 3 0
104 2 1
105 1 0

Check the demo here.

Upvotes: 3

Related Questions