Reputation: 33
I am working in MySQL. Initially I had a table that looks like this:
My task is to calculate the percentage of bookings that were cancelled for people that were in a long waiting list and for people that had short waiting. After some manipulations I came up with the following code:
SELECT
CASE WHEN days_in_waiting_list > (SELECT AVG(days_in_waiting_list) FROM Bookings) THEN 'Long wait'
ELSE 'Short wait'
END AS waiting,
is_canceled, COUNT(*), count(*) * 100.0 / sum(count(*)) over() AS perc_cancelled
FROM Bookings
GROUP BY waiting, is_canceled;
The resulting table:
But I want the percentages to be calculated for the category, not the whole table. So that the sum of percentages in Short wait was equal to 100, the same is for Long wait. I want it to be like this:
waiting | is_cancelled | perc |
---|---|---|
Short wait | 0 | 0.61 |
Short wait | 1 | 0.39 |
Long wait | 0 | 0.32 |
Long wait | 1 | 0.68 |
Is there a way to do this? I know that it is possible using over(partition by waiting), but it gives me the error
Error Code: 1054. Unknown column 'waiting' in 'window partition by'
Upvotes: 2
Views: 1423
Reputation: 3210
Creating the table with only the relevant data, and 40 random rows :
CREATE TABLE test (
id INT AUTO_INCREMENT PRIMARY KEY,
is_canceled INT DEFAULT FLOOR(RAND() * 2),
days_in_waiting_list INT DEFAULT FLOOR(RAND() * 100)
);
INSERT INTO test () VALUES (), (), (), (), (), (), (), (), (), ();
INSERT INTO test () VALUES (), (), (), (), (), (), (), (), (), ();
INSERT INTO test () VALUES (), (), (), (), (), (), (), (), (), ();
INSERT INTO test () VALUES (), (), (), (), (), (), (), (), (), ();
Your were really close, just add PARTITION BY waiting
:
SELECT
CASE WHEN days_in_waiting_list > (
SELECT AVG(days_in_waiting_list)
FROM test
) THEN 'Long' ELSE 'Short' END AS waiting,
is_canceled,
COUNT(*),
COUNT(*) * 100.0 / SUM(COUNT(*)) OVER (PARTITION BY waiting)
FROM test
GROUP BY waiting, is_canceled;
waiting | is_canceled | count(*) | count() / sum(count()) over (partition by waiting) |
---|---|---|---|
Long | 0 | 10 | 58.82353 |
Long | 1 | 7 | 41.17647 |
Short | 0 | 15 | 65.21739 |
Short | 1 | 8 | 34.78261 |
Upvotes: 1
Reputation: 222482
I would use window functions to compute the average, then aggregation:
select waiting, is_cancelled,
count(*) / sum(count(*)) over(partition by waiting) as ratio
from (
select b.*,
case when days_in_waiting_list > avg(days_in_waiting_list) over()
then 'Long Wait'
else 'Short wait'
end as waiting
from bookings b
) b
group by waiting, is_cancelled
order by waiting, is_cancelled
Upvotes: 1
Reputation: 701
WITH cte AS (
SELECT
CASE WHEN days_in_waiting_list > (SELECT AVG(days_in_waiting_list) FROM Bookings)
THEN 'Long wait'
ELSE 'Short wait'
END AS waiting,
is_canceled,
COUNT(*) as subTotal,
sum(count(*) over (partition by waiting ORDER BY is_canceled ASC) as totalSum,
FROM Bookings
GROUP BY waiting, is_canceled;
)
SELECT waiting,
is_canceled,
subTotal* (100.0)/totalSum as percentage
FROM cte
group by waiting,is_canceled
Upvotes: 1
Reputation: 11
you want the percentages to be calculated per category, thus grouping by waiting category is needed for perc_cancelled
count() * 100.0 / sum(count()) over(partition by waiting) AS perc_cancelled
Upvotes: 1