sqlsql-serversubquerycommon-table-expression

Reputation: 815

How to correct this SQL statement?

Service table:

Car_id	Service Date
101	2016-02-05
102	2017-02-06
103	2017-05-07
104	2017-02-08
105	2017-05-09
106	2016-03-10
101	2016-08-10
101	2016-02-20
106	2016-08-15
106	2016-12-20

Expected result:

Car_id	Service_Date(Last)	Service_date(Previous)
101	2016-08-10	2016-02-20
102	2017-02-06	-
103	2017-05-07	-
104	2017-02-08	-
105	2017-05-09	-
106	2016-12-20	2016-08-15
107	-	-
108	-	-

My current code:

WITH B AS
(
    SELECT 
        A.Car_id, A.Service_Date AS Last, 
        LAG(A.Service_Date) OVER (ORDER BY A.Service_Date) AS Previous
    FROM 
        Service AS A
)
SELECT 
    E.Car_id, F.Last, F.Previous
FROM 
    ((SELECT C.Car_id, C.Last, C.Previous
      FROM 
          (SELECT
               B.Car_id, B.Last, B.Previous, 
               ROW_NUMBER() OVER (PARTITION BY B.Car_id ORDER BY B.Last DESC) AS k
           FROM 
               B) AS C
      WHERE C.K = 1)) AS F 
RIGHT JOIN 
    Car AS E ON F.Car_id = E.Car_id

My current output:

Car_id	Service_Date(Last)	Service_date(Previous)
101	2016-08-10	2016-03-10
102	2017-02-06	2016-12-20
103	2017-05-07	2017-02-08
104	2017-02-08	2017-02-06
105	2017-05-09	2017-05-07
106	2016-12-20	2016-08-15
107	NULL	NULL
108	NULL	NULL

The right join is used to extract some columns from other tables which I temporary removed. The issue with the code is that some values in the previous date column seems to be incorrect. In previous date column should contain '-' if there is no other service date for that Car_ID. Car_ID = 107 and 108 haven't received any service so it they are in the service table. I think that a CASE statement needs to be used, but I can't seem to figure where to place it.

Upvotes: 0

Answers (3)

Joel Coehoorn

Reputation: 416131

Need to partition by the car. Once we do that, we can also put both window functions in the same place and get it down to one level of nesting. Additionally, pretty much every RIGHT JOIN (there are rare exceptions) can and should be rewritten as a LEFT JOIN instead:

WITH Windowed As 
(
    SELECT Car_ID, Service_Date
        , LAG(Service_Date) OVER (PARTITION BY Car_ID ORDER BY Service_Date) Prev
        , ROW_NUMBER() OVER (PARTITION BY Car_ID ORDER BY Service_Date DESC) rn
    FROM Service
)
SELECT c.Car_ID, w.Service_Date As [Service_Date(Last)], w.Prev As [Service_Date(Previous)]
FROM Car c
LEFT JOIN Windowed w ON w.Car_ID = c.Car_ID AND w.rn = 1
ORDER BY c.Car_ID

See it work here (with appreciation to Bart for getting the fiddle started):

https://dbfiddle.uk/J7PQb_HN

previous date column should contain '-' if there is no other service date for that Car_ID

The SQL language is strongly typed. Columns have data types, and you can't mix data types across rows. In this case we have a date type if there's any sense to the schema at all, but the desired - value is not valid for that type.

You have two options. The best option here (by far) is just returning null and doing the formatting in your client code or reporting tool. Absolutely let users see the - character if that's what the specs call for, but do it in a presentation tier rather than in the database.

But if you absolutely MUST do this in SQL, the alternative is completely abandoning dates and going full string. You have to wrap the final date column selections with expressions like this, so the types are compatible:

COALESCE(CONVERT(varchar(10), Prev, 23), '-') As [Service_Date(Previous)]

And at this point you aren't really returning date values anymore at all: you're returning strings where some of them just happen to look kind of like dates.

You can see it if you really want, but again: I think you should prefer just returning null, as with my first solution:

https://dbfiddle.uk/qUwwqO6K

Finally, column and table aliases in SQL should be mnemonics, not just random letters like A, B, C, etc. You should be able to infer meaning from the chosen alias, as the code in this answer demonstrates: w for "Windowed" or c for "Car".

Upvotes: 4

T N

Reputation: 10204

Just to throw a couple of other options out there: After numbering the service dates descending per car, you can use a PIVOT table or conditional aggregation to organize your dates into columns. This can easily be expanded to show the last N service calls.

Using a pivot table:

WITH ServiceDates AS (
    SELECT PVT.*
    FROM (
        SELECT
            *,
            ROW_NUMBER() OVER(PARTITION BY Car_id ORDER BY Service_Date DESC) AS RowNum
        FROM Service
    ) SN
    PIVOT (
        MAX(Service_Date)
        FOR RowNum IN ([1], [2], [3])
    ) PVT
)
SELECT
    C.*,
    SD.[1] AS [Service_Date(Last)],
    SD.[2] AS [Service_Date(Previous)],
    SD.[3] AS [Service_Date(Previous2)]
FROM Car C
LEFT JOIN ServiceDates SD
    ON SD.Car_id = C.Car_id
ORDER BY C.Car_id;

Using conditional aggregation:

WITH ServiceDates AS (
    SELECT
        Car_id,
        MAX(CASE WHEN RowNum = 1 THEN Service_Date END) AS [Service_Date(Last)],
        MAX(CASE WHEN RowNum = 2 THEN Service_Date END) AS [Service_Date(Previous)],
        MAX(CASE WHEN RowNum = 3 THEN Service_Date END) AS [Service_Date(Previous2)]
    FROM (
        SELECT
            *,
            ROW_NUMBER() OVER(PARTITION BY Car_id ORDER BY Service_Date DESC) AS RowNum
        FROM Service
    ) SN
    GROUP BY Car_id
)
SELECT
    C.*,
    SD.[Service_Date(Last)],
    SD.[Service_Date(Previous)],
    SD.[Service_Date(Previous2)]
FROM Car C
LEFT JOIN ServiceDates SD
    ON SD.Car_id = C.Car_id
ORDER BY C.Car_id;

A pivot table is the traditional way to map rows to columns. Conditional aggregation is just an alternate technique that accomplishes the same using an aggregation function (typically MAX()) and a CASE expression. If the condition is true, the value is included. If not, the implied ELSE NULL is ignored.

(I swapped the RIGHT JOIN order around to make it a LEFT JOIN, only because that is my personal preference, as it more closely follows my thought processes. Technically, it makes no difference.)

Results (expanded to last three service dates):

Car_id	Description	Service_Date(Last)	Service_Date(Previous)	Service_Date(Previous2)
101	red honda	2016-08-10	2016-02-20	2016-02-05
102	blue ford	2017-02-06	null	null
103	green chevy	2017-05-07	null	null
104	white youota	2017-02-08	null	null
105	black mazda	2017-05-09	null	null
106	yellow bug	2016-12-20	2016-08-15	2016-03-10
107	silver cyber truck	2024-04-04	2024-03-03	2024-02-02
108	orange aztec	null	null	null
108	orange aztec	null	null	null

You can select whichever form you consider to be easiest to understand and maintain.

See this db<>fiddle for a demo. This fiddle shows a comparison of the execution plan for various answers posted so far.

Upvotes: 1

trillion

Reputation: 1401

with main as (
   select 
      car_id, 
      service_date,
      max(service_date) over (partition by car_id) as latest_date,
      coalesce(lag (service_date) over (partition by car_id order by service_date),'-') as previous_date
   from service
)
select
   car.car_id,
   main.latest_date,
   main.previous_date
from car
left join main
   on car.car_id = main.car_id
where service_date = latest_date or service_date is null

Test the Solution

if car can be gone for service twice then we need to remove duplicate using row_number.

with main as (
   select 
      car_id, 
      service_date,
      row_number() over (partition by car_id order by service_date desc) as ranking, 
      -- ranks the service date 1,2,3 i.e last = 1 (order by is in descending)
      coalesce(lag (service_date) over (partition by car_id order by service_date),'-') as previous_date
   from service
)
select
   car.car_id,
   main.latest_date,
   main.previous_date
from car
left join main
   on car.car_id = main.car_id
where ranking = 1
-- by default the row will join to the latest service date per car due to the ranking

solution # 2 fiddle results: https://dbfiddle.uk/uNy2Z8z-

Upvotes: 1

How to correct this SQL statement?

Answers (3)

Related Questions