NoobAtProgramming
NoobAtProgramming

Reputation: 21

Error: recursive reference to query "shortestflight" must not appear within a subquery

From "Montreal" as the origin point, I'm trying to find the shortest path to go to another city. But I'm having difficulty when a city is looped.
demo at db<>fiddle

create table flights(id,src,dst,cost)as values
 (0,'Montreal','NY',      742.94)
,(1,'Montreal','Detroit', 362.47)
,(2,'NY',      'Detroit', 936.60)
,(3,'Detroit', 'LA',      64.32 )
,(4,'NY',      'Montreal',94.26 )
,(5,'Detroit', 'NY',      213.86)
,(6,'LA',      'Detroit', 490.88);

If I'm going to LA, there's 2 ways:

I want that the shortest, 2-step way to LA via (only) Detroit to be shown.

I tried this code and I was expecting that the shortest steps would be shown:

WITH RECURSIVE ShortestFlight AS (
    SELECT src
         , dst
         , cost
         , cost AS total_cost
         , 1 AS steps
    FROM Flights
    WHERE src = 'Montreal'

    UNION

    SELECT f.src
         , f.dst
         , f.cost
         , sf.total_cost + f.cost AS total_cost
         , sf.steps + 1
    FROM Flights f
    JOIN ShortestFlight sf 
      ON f.src = sf.dst
    WHERE NOT EXISTS (
        SELECT FROM ShortestFlight 
        WHERE sf2.dst = f.dst 
          AND sf2.steps <= sf.steps + 1
    )
)
SELECT *
FROM ShortestFlight
ORDER BY steps
       , total_cost;

but I get the error on

ERROR:  recursive reference to query "shortestflight" must not appear within a subquery
LINE 21:         SELECT FROM ShortestFlight 
                             ^

I would like to know an alternative so my code works.

Upvotes: 1

Views: 99

Answers (2)

Zegarek
Zegarek

Reputation: 26347

Pick the shortest path using a distinct on (or order by..limit 1, or window/aggregate functions). As already pointed out by @Laurenz Albe, you also need to add cycle detection:
demo at db<>fiddle

WITH RECURSIVE ShortestFlight AS (
    SELECT src
         , dst
         , cost
         , cost AS total_cost
         , 1 AS steps
    FROM Flights
    WHERE src = 'Montreal'

    UNION

    SELECT f.src
         , f.dst
         , f.cost
         , sf.total_cost + f.cost AS total_cost
         , sf.steps + 1
    FROM Flights f
    JOIN ShortestFlight sf 
      ON f.src = sf.dst
    WHERE f.dst <>'Montreal'--no point coming back
      AND sf.dst<>'LA'--paths that already reached LA aren't extended further
)CYCLE src SET is_cycle USING path
SELECT DISTINCT ON(dst)*,path||row(dst) as full_path
FROM ShortestFlight
WHERE NOT is_cycle 
  AND dst='LA'
  AND steps     <10   --some unreasonably high limits might be more reasonable
  AND total_cost<9000 --than no limits at all
ORDER BY dst
       , steps
       , total_cost;
src dst cost total_cost steps is_cycle path full_path
Detroit LA 64.32 426.79 2 f {(Montreal),(Detroit)} {(Montreal),(Detroit),(LA)}

It skips the longer path but most importantly, this code doesn't fall into an infinite loop. Without cycle detection there was nothing that would stop the CTE from falling into a Detroit-Montreal-Detroit loop (or any pair with a return connection available), for eternity or until it times out or hits a stack limit.

While you can use recursive CTEs to implement routing algorithms, there's pgrouting extension that can handle that way more effectively. If you add airport locations you can even upgrade Dijkstra to A*.

Upvotes: 0

ValNik
ValNik

Reputation: 5916

As already pointed out by @LaurenzAlbe and @Zegrek, you need to add cycle detection.
You will not be able to check for a loop using a subquery to a recursive subquery. You can eliminate loops by manually composing a path and checking for a point in that path, or using CYCLE option for recursive query.

I also think that it is better to determine the "short" route ranking by a window function.
You may have several paths with the same number of steps. Among them, choose the path with the lowest cost.
Or immediately choose the path with the lowest cost, regardless of the number of steps. I have added additional data to the example to show these options.

See example
Source data:

id src dst cost
0 Montreal NY 542.94
1 Montreal Detroit 840.47
2 NY Detroit 779.60
3 Detroit LA 3185.32
4 NY Montreal 542.02
5 Detroit NY 779.3
6 LA Detroit 3185.88
7 NY LA 3952.88
8 Montreal Minneapolis 1526.88
9 Minneapolis Denver 1081.88
10 Denver LA 1355.88
with recursive trips as(
  select 1 lvl,id,src,dst,cost totalCost
     ,concat(src,'->',dst) route
  from flights
  where src='Montreal'
  union all
  select lvl+1,t.id,r.src,t.dst,totalCost+cost
     ,concat(route,'->',t.dst) route
  from trips r
  inner join  flights t on t.src=r.dst 
     and r.dst<>'LA'  -- cut recursion if already destination='LA'
    -- check cycles
  where  strpos(concat('->',r.route,'->'),concat('->',t.dst,'->'))=0
)
select src,dst, lvl steps,  totalcost,  route,  rn
from(
  select * 
    ,row_number()over(partition by src,dst order by lvl,totalcost) rn
  from trips
  where dst='LA'
)t
where rn=1

Output with least steps and cost

src dst steps totalcost route rn
Montreal LA 2 4025.79 Montreal->Detroit->LA 1

Output with least cost

src dst steps totalcost route rn
Montreal LA 3 3964.64 Montreal->Minneapolis->Denver->LA 1
with recursive trips as(
  select 1 lvl,id,src,dst,cost totalCost
     ,concat(src,'->',dst) route
  from flights
  where src='Montreal'
  union all
  select lvl+1,t.id,r.src,t.dst,totalCost+cost
     ,concat(route,'->',t.dst) route
  from trips r
  inner join  flights t on t.src=r.dst and r.dst<>'LA'
  where  strpos(concat('->',r.route,'->'),concat('->',t.dst,'->'))=0
)
select src,dst, lvl steps,  totalcost,  route,  rn
from(
select * 
  ,row_number()over(partition by src,dst order by lvl,totalcost) rn
from trips
where dst='LA'
)t
-- where rn=1

All path's, ordered by step,cost

src dst steps totalcost route rn
Montreal LA 2 4025.79 Montreal->Detroit->LA 1
Montreal LA 2 4495.82 Montreal->NY->LA 2
Montreal LA 3 3964.64 Montreal->Minneapolis->Denver->LA 3
Montreal LA 3 4507.86 Montreal->NY->Detroit->LA 4
Montreal LA 3 5572.65 Montreal->Detroit->NY->LA 5

fiddle

Upvotes: 0

Related Questions