Reputation: 31
I am trying to solve the problem in the mentioned link and came up with the following code. But my cancellation rate results are wrong. What is wrong with my code? Problem link: https://leetcode.com/problems/trips-and-users/
Code:
select t1.Day, ifnull(cast((t1.Cancelled/t2.tot_cnt) as decimal(12,2)), 0.00) as Cancellation_Rate from
(select Day, count(Stat) as Cancelled from
(select t.Status as Stat, t.Request_at as Day from Trips t join Users c on t.Client_Id = c.Users_Id
join Users d on t.Driver_Id = d.Users_Id
where c.Banned = 'No' and d.Banned = 'No') a
where Stat in ('cancelled_by_driver', 'cancelled_by_client')
group by Day) t1 join
(select Day, count(Stat) as tot_cnt from
(select t.Status as Stat, t.Request_at as Day from Trips t join Users c on t.Client_Id = c.Users_Id
join Users d on t.Driver_Id = d.Users_Id
where c.Banned = 'No' and d.Banned = 'No') b
group by Day) t2
on t1.Day = t2.Day where t1.Day in ('2013-10-01','2013-10-02','2013-10-03')
DDLs:
CREATE TABLE trips(
Id INTEGER NOT NULL PRIMARY KEY
,Client_Id INTEGER NOT NULL
,Driver_Id INTEGER NOT NULL
,City_Id INTEGER NOT NULL
,Status VARCHAR(19) NOT NULL
,Request_at DATE NOT NULL
);
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (1,1,10,1,'completed','2013-10-01');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (2,2,11,1,'cancelled_by_driver','2013-10-01');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (3,3,12,6,'completed','2013-10-01');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (4,4,13,6,'cancelled_by_client','2013-10-01');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (5,1,10,1,'completed','2013-10-02');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (6,2,11,6,'completed','2013-10-02');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (7,3,12,6,'completed','2013-10-02');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (8,2,12,12,'completed','2013-10-03');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (9,3,10,12,'completed','2013-10-03');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (10,4,13,12,'cancelled_by_driver','2013-10-03');
CREATE TABLE users(
id INTEGER NOT NULL PRIMARY KEY
,name VARCHAR(29) NOT NULL
,amount NUMERIC(6,0) NOT NULL
,Remark VARCHAR(45)
);
INSERT INTO users(id,name,amount,Remark) VALUES (1,'No',client,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (2,'Yes',client,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (3,'No',client,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (4,'No',client,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (10,'No',driver,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (11,'No',driver,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (12,'No',driver,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (13,'No',driver,NULL);
Upvotes: 0
Views: 699
Reputation: 1
select
request_at as "Day",
round(
(sum(case when status = "cancelled_by_driver" or status = "cancelled_by_client" then 1
else 0 end) / count(status)), 2) as "Cancellation Rate"
from
Trips
where
client_id not in (select users_id from Users where role = 'client' and banned ='Yes')
and
driver_id not in (select users_id from Users where role = 'driver' and banned ='Yes')
and
request_at >= "2013-10-01" and request_at <= "2013-10-03"
group by
request_at
Upvotes: 0
Reputation: 3
Solution:
(SELECT P1.request_at as Day,CAST((Total-Completed*1.00)/Total as Decimal(10,2)) as 'Cancellation Rate' FROM (SELECT request_at ,COUNT(status) as Completed FROM Trips WHERE client_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes')
AND driver_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes') AND status ='completed' GROUP BY request_at) P1
INNER JOIN
(SELECT request_at ,COUNT(status) as Total FROM Trips WHERE client_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes')
AND driver_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes') GROUP BY request_at) P2 ON P1.request_at=P2.request_at WHERE P1.request_at>= '2013-10-01' AND P1.request_at<='2013-10-03')
UNION
(SELECT P1.request_at as Day,CAST((Cancelled*1.00)/Total as Decimal(10,2)) as 'Cancellation Rate' FROM (SELECT request_at ,COUNT(status) as Cancelled FROM Trips WHERE client_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes')
AND driver_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes') AND status like '%cancelled%' GROUP BY request_at) P1
INNER JOIN
(SELECT request_at ,COUNT(status) as Total FROM Trips WHERE client_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes')
AND driver_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes') GROUP BY request_at) P2 ON P1.request_at=P2.request_at WHERE P1.request_at>= '2013-10-01' AND P1.request_at<='2013-10-03')
Step 1: Exclude Banned Id Then count how many cancellations for certain date
SELECT request_at ,COUNT(status) as Cancelled FROM Trips WHERE client_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes' AND driver_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes') AND status like '%cancelled%' GROUP BY request_at
Step 2: Apply Same for calculating total requests between dates
(SELECT request_at ,COUNT(status) as Total FROM Trips WHERE client_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes')
AND driver_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes') GROUP BY request_at) P2 ON P1.request_at=P2.request_at WHERE P1.request_at>= '2013-10-01' AND P1.request_at<='2013-10-03')
Step 3: Join Step 1 and Step 2 on condition (table1.date=table2.date). From new columns we can calculate ratio from ratio=cancelled/total
(SELECT P1.request_at as Day,CAST((Cancelled*1.00)/Total as Decimal(10,2)) as 'Cancellation Rate' FROM (SELECT request_at ,COUNT(status) as Cancelled FROM Trips WHERE client_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes')
AND driver_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes') AND status like '%cancelled%' GROUP BY request_at) P1
INNER JOIN
(SELECT request_at ,COUNT(status) as Total FROM Trips WHERE client_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes')
AND driver_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes') GROUP BY request_at) P2 ON P1.request_at=P2.request_at WHERE P1.request_at>= '2013-10-01' AND P1.request_at<='2013-10-03')
Step 4: For corner cases apply step 1 and 2 for completed requests
(SELECT P1.request_at as Day,CAST((Total-Completed*1.00)/Total as Decimal(10,2)) as 'Cancellation Rate' FROM (SELECT request_at ,COUNT(status) as Completed FROM Trips WHERE client_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes')
AND driver_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes') AND status ='completed' GROUP BY request_at) P1
INNER JOIN
(SELECT request_at ,COUNT(status) as Total FROM Trips WHERE client_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes')
AND driver_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes') GROUP BY request_at) P2 ON P1.request_at=P2.request_at WHERE P1.request_at>= '2013-10-01' AND P1.request_at<='2013-10-03')
Step 5: Union Step 3 and 4
Corner Cases: 1)If No Cancellation trip occurred (Step 3 returns zero row) 2)If No Completion trip occurred (Step 4 returns zero row)
Upvotes: 0
Reputation: 317
I would create a common table expression modifying the Trips table. I'm creating a column 'user_banned' that has a 'No' value if both the driver and client are not banned, otherwise a 'Yes'.
Then, I'm calculating the cancellation rate by taking the sum of rows when the trip was canceled.
WITH mod_trips AS
(
SELECT trips.*,
CASE
WHEN u1.banned = 'No'
AND u2.banned = 'No' THEN 'No'
ELSE 'Yes'
END AS user_banned
FROM trips
INNER JOIN users U1
ON trips.client_id = u1.users_id
INNER JOIN users U2
ON trips.driver_id = u2.users_id )
SELECT mod_trips.request_at day,
Round(Sum(
CASE
WHEN mod_trips.status != 'completed' THEN 1
ELSE 0
END)/Count(*),2) AS `cancellation rate`
FROM mod_trips
INNER JOIN users
ON mod_trips.client_id = users.users_id
WHERE mod_trips.user_banned = 'No'
AND "2013-10-01" <= request_at
AND request_at <= "2013-10-03"
GROUP BY mod_trips.request_at
Jen Li has a great article about it: Trips and Users, Solution by Jen Li Chen
Upvotes: 0
Reputation: 14958
Create and Insert statements to this problem:
CREATE TABLE trips(
Id INTEGER NOT NULL PRIMARY KEY
,Client_Id INTEGER NOT NULL
,Driver_Id INTEGER NOT NULL
,City_Id INTEGER NOT NULL
,Status VARCHAR(19) NOT NULL
,Request_at DATE NOT NULL
);
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (1,1,10,1,'completed','2013-10-01');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (2,2,11,1,'cancelled_by_driver','2013-10-01');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (3,3,12,6,'completed','2013-10-01');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (4,4,13,6,'cancelled_by_client','2013-10-01');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (5,1,10,1,'completed','2013-10-02');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (6,2,11,6,'completed','2013-10-02');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (7,3,12,6,'completed','2013-10-02');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (8,2,12,12,'completed','2013-10-03');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (9,3,10,12,'completed','2013-10-03');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (10,4,13,12,'cancelled_by_driver','2013-10-03');
CREATE TABLE users(
id INTEGER NOT NULL PRIMARY KEY
,name VARCHAR(29) NOT NULL
,amount NUMERIC(6,0) NOT NULL
,Remark VARCHAR(45)
);
INSERT INTO users(id,name,amount,Remark) VALUES (1,'No',client,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (2,'Yes',client,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (3,'No',client,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (4,'No',client,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (10,'No',driver,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (11,'No',driver,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (12,'No',driver,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (13,'No',driver,NULL);
Upvotes: 0
Reputation: 95062
You are changing your query while I am writing this. Your problem was here:
where Stat in ('cancelled_by_driver', 'cancelled_by_client')
By applying this restriction you don't see the total rides anymore that you need for the ratio.
with CTE as
(
select t.Status as Stat, t.Request_at as Day
from Trips t
join Users c on t.Client_Id = c.Users_Id
join Users d on t.Driver_Id = d.Users_Id
where c.Banned = 'No' and d.Banned = 'No'
)
select
Day,
avg(Stat in ('cancelled_by_driver', 'cancelled_by_client')) as Cancellation_Rate
from CTE
where Day in (date '2013-10-01', date '2013-10-02', date '2013-10-03')
group by Day
order by Day;
You can move the WHERE
clause inside the CTE or leave it outside or just do this without any CTE; it doesn't make a difference.
Using AVG
on the boolean expression works in MySQL, because MySQL treats true = 1 and false = 0.
Upvotes: 0