bhargav
bhargav

Reputation: 31

SQL Code to find Cancellation Rates of Trips made by unbanned clients and drivers

I am trying to solve the problem in the mentioned link and came up with the following code. But my cancellation rate results are wrong. What is wrong with my code? Problem link: https://leetcode.com/problems/trips-and-users/

Code:

select t1.Day, ifnull(cast((t1.Cancelled/t2.tot_cnt) as decimal(12,2)), 0.00) as Cancellation_Rate from 
(select Day, count(Stat) as Cancelled from
(select t.Status as Stat, t.Request_at as Day from Trips t join Users c on t.Client_Id = c.Users_Id
join Users d on t.Driver_Id = d.Users_Id
where c.Banned = 'No' and d.Banned = 'No') a                       
where Stat in ('cancelled_by_driver', 'cancelled_by_client')
group by Day) t1 join

(select Day, count(Stat) as tot_cnt from 
 (select t.Status as Stat, t.Request_at as Day from Trips t join Users c on t.Client_Id = c.Users_Id
join Users d on t.Driver_Id = d.Users_Id
where c.Banned = 'No' and d.Banned = 'No') b
 group by Day) t2

on t1.Day = t2.Day where t1.Day in ('2013-10-01','2013-10-02','2013-10-03')

DDLs:

CREATE TABLE trips(
   Id         INTEGER  NOT NULL PRIMARY KEY 
  ,Client_Id  INTEGER  NOT NULL
  ,Driver_Id  INTEGER  NOT NULL
  ,City_Id    INTEGER  NOT NULL
  ,Status     VARCHAR(19) NOT NULL
  ,Request_at DATE  NOT NULL
);
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (1,1,10,1,'completed','2013-10-01');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (2,2,11,1,'cancelled_by_driver','2013-10-01');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (3,3,12,6,'completed','2013-10-01');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (4,4,13,6,'cancelled_by_client','2013-10-01');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (5,1,10,1,'completed','2013-10-02');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (6,2,11,6,'completed','2013-10-02');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (7,3,12,6,'completed','2013-10-02');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (8,2,12,12,'completed','2013-10-03');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (9,3,10,12,'completed','2013-10-03');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (10,4,13,12,'cancelled_by_driver','2013-10-03');


CREATE TABLE users(
   id       INTEGER  NOT NULL PRIMARY KEY 
  ,name     VARCHAR(29) NOT NULL
  ,amount   NUMERIC(6,0) NOT NULL
  ,Remark   VARCHAR(45)
);
INSERT INTO users(id,name,amount,Remark) VALUES (1,'No',client,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (2,'Yes',client,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (3,'No',client,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (4,'No',client,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (10,'No',driver,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (11,'No',driver,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (12,'No',driver,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (13,'No',driver,NULL);

Upvotes: 0

Views: 699

Answers (5)

Donald Yakam
Donald Yakam

Reputation: 1

select 
    request_at as "Day",
    round(
        (sum(case when status = "cancelled_by_driver" or status = "cancelled_by_client" then 1 
        else 0 end) / count(status)), 2) as "Cancellation Rate"
from
    Trips
where 
    client_id not in (select users_id from Users where role = 'client' and banned ='Yes') 
and 
    driver_id not in (select users_id from Users where role = 'driver' and banned ='Yes') 
and 
    request_at >= "2013-10-01" and request_at <= "2013-10-03"
group by 
    request_at

Upvotes: 0

Sadman Jahin
Sadman Jahin

Reputation: 3

Solution:

(SELECT P1.request_at as Day,CAST((Total-Completed*1.00)/Total as Decimal(10,2)) as 'Cancellation Rate' FROM (SELECT request_at ,COUNT(status) as Completed FROM Trips WHERE client_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes')
 AND driver_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes') AND status ='completed' GROUP BY request_at) P1
INNER JOIN
 (SELECT request_at ,COUNT(status) as Total FROM Trips WHERE client_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes')
 AND driver_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes') GROUP BY request_at) P2 ON P1.request_at=P2.request_at WHERE P1.request_at>= '2013-10-01' AND P1.request_at<='2013-10-03')
 
  UNION


(SELECT P1.request_at as Day,CAST((Cancelled*1.00)/Total as Decimal(10,2)) as 'Cancellation Rate' FROM (SELECT request_at ,COUNT(status) as Cancelled FROM Trips WHERE client_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes')
 AND driver_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes') AND status like '%cancelled%' GROUP BY request_at) P1
INNER JOIN
 (SELECT request_at ,COUNT(status) as Total FROM Trips WHERE client_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes')
 AND driver_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes') GROUP BY request_at) P2 ON P1.request_at=P2.request_at WHERE P1.request_at>= '2013-10-01' AND P1.request_at<='2013-10-03')
 

Step 1: Exclude Banned Id Then count how many cancellations for certain date

SELECT request_at ,COUNT(status) as Cancelled FROM Trips WHERE client_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes' AND driver_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes') AND status like '%cancelled%' GROUP BY request_at

Step 2: Apply Same for calculating total requests between dates

(SELECT request_at ,COUNT(status) as Total FROM Trips WHERE client_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes')
 AND driver_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes') GROUP BY request_at) P2 ON P1.request_at=P2.request_at WHERE P1.request_at>= '2013-10-01' AND P1.request_at<='2013-10-03')

Step 3: Join Step 1 and Step 2 on condition (table1.date=table2.date). From new columns we can calculate ratio from ratio=cancelled/total

(SELECT P1.request_at as Day,CAST((Cancelled*1.00)/Total as Decimal(10,2)) as 'Cancellation Rate' FROM (SELECT request_at ,COUNT(status) as Cancelled FROM Trips WHERE client_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes')
 AND driver_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes') AND status like '%cancelled%' GROUP BY request_at) P1
INNER JOIN
 (SELECT request_at ,COUNT(status) as Total FROM Trips WHERE client_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes')
 AND driver_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes') GROUP BY request_at) P2 ON P1.request_at=P2.request_at WHERE P1.request_at>= '2013-10-01' AND P1.request_at<='2013-10-03')

Step 4: For corner cases apply step 1 and 2 for completed requests

(SELECT P1.request_at as Day,CAST((Total-Completed*1.00)/Total as Decimal(10,2)) as 'Cancellation Rate' FROM (SELECT request_at ,COUNT(status) as Completed FROM Trips WHERE client_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes')
 AND driver_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes') AND status ='completed' GROUP BY request_at) P1
INNER JOIN
 (SELECT request_at ,COUNT(status) as Total FROM Trips WHERE client_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes')
 AND driver_id NOT IN(SELECT users_id FROM Users WHERE banned = 'Yes') GROUP BY request_at) P2 ON P1.request_at=P2.request_at WHERE P1.request_at>= '2013-10-01' AND P1.request_at<='2013-10-03')

Step 5: Union Step 3 and 4

Corner Cases: 1)If No Cancellation trip occurred (Step 3 returns zero row) 2)If No Completion trip occurred (Step 4 returns zero row)

Upvotes: 0

Anish
Anish

Reputation: 317

I would create a common table expression modifying the Trips table. I'm creating a column 'user_banned' that has a 'No' value if both the driver and client are not banned, otherwise a 'Yes'.

Then, I'm calculating the cancellation rate by taking the sum of rows when the trip was canceled.

WITH mod_trips AS
(
           SELECT     trips.*,
                      CASE
                                 WHEN u1.banned = 'No'
                                 AND  u2.banned = 'No' THEN 'No'
                                 ELSE 'Yes'
                      END AS user_banned
           FROM       trips
           INNER JOIN users U1
           ON         trips.client_id = u1.users_id
           INNER JOIN users U2
           ON         trips.driver_id = u2.users_id )
SELECT     mod_trips.request_at day,
           Round(Sum(
           CASE
                      WHEN mod_trips.status != 'completed' THEN 1
                      ELSE 0
           END)/Count(*),2) AS `cancellation rate`
FROM       mod_trips
INNER JOIN users
ON         mod_trips.client_id = users.users_id
WHERE      mod_trips.user_banned = 'No'
AND        "2013-10-01" <= request_at
AND        request_at <= "2013-10-03"
GROUP BY   mod_trips.request_at

Jen Li has a great article about it: Trips and Users, Solution by Jen Li Chen

Upvotes: 0

Luuk
Luuk

Reputation: 14958

Create and Insert statements to this problem:

CREATE TABLE trips(
   Id         INTEGER  NOT NULL PRIMARY KEY 
  ,Client_Id  INTEGER  NOT NULL
  ,Driver_Id  INTEGER  NOT NULL
  ,City_Id    INTEGER  NOT NULL
  ,Status     VARCHAR(19) NOT NULL
  ,Request_at DATE  NOT NULL
);
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (1,1,10,1,'completed','2013-10-01');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (2,2,11,1,'cancelled_by_driver','2013-10-01');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (3,3,12,6,'completed','2013-10-01');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (4,4,13,6,'cancelled_by_client','2013-10-01');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (5,1,10,1,'completed','2013-10-02');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (6,2,11,6,'completed','2013-10-02');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (7,3,12,6,'completed','2013-10-02');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (8,2,12,12,'completed','2013-10-03');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (9,3,10,12,'completed','2013-10-03');
INSERT INTO trips(Id,Client_Id,Driver_Id,City_Id,Status,Request_at) VALUES (10,4,13,12,'cancelled_by_driver','2013-10-03');


CREATE TABLE users(
   id       INTEGER  NOT NULL PRIMARY KEY 
  ,name     VARCHAR(29) NOT NULL
  ,amount   NUMERIC(6,0) NOT NULL
  ,Remark   VARCHAR(45)
);
INSERT INTO users(id,name,amount,Remark) VALUES (1,'No',client,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (2,'Yes',client,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (3,'No',client,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (4,'No',client,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (10,'No',driver,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (11,'No',driver,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (12,'No',driver,NULL);
INSERT INTO users(id,name,amount,Remark) VALUES (13,'No',driver,NULL);

Upvotes: 0

Thorsten Kettner
Thorsten Kettner

Reputation: 95062

You are changing your query while I am writing this. Your problem was here:

where Stat in ('cancelled_by_driver', 'cancelled_by_client')

By applying this restriction you don't see the total rides anymore that you need for the ratio.

with CTE as  
(
  select t.Status as Stat, t.Request_at as Day 
  from Trips t 
  join Users c on t.Client_Id = c.Users_Id
  join Users d on t.Driver_Id = d.Users_Id
  where c.Banned = 'No' and d.Banned = 'No'
)    
select
  Day,
  avg(Stat in ('cancelled_by_driver', 'cancelled_by_client')) as Cancellation_Rate 
from CTE
where Day in  (date '2013-10-01', date '2013-10-02', date '2013-10-03')
group by Day
order by Day;

You can move the WHERE clause inside the CTE or leave it outside or just do this without any CTE; it doesn't make a difference.

Using AVG on the boolean expression works in MySQL, because MySQL treats true = 1 and false = 0.

Upvotes: 0

Related Questions