Cypher: Calculate based on node properties

Question

I develop a routing system to find the best routes by using several mobility offers, e. g. public transport or car pooling. Basically I have nodes, that represent bus stops, train stops etc. These stops include time schedules for each day. Additionally I have pedestrian stops. These stops are connected by relations. If I change from pedestrian to bus, or between bus lines, the relation is called SWITCH_TO. If I drive several stops without changing the service line, the stops are connected by a relation called CONNECTED_TO. If I want to go from point A to point Z and I have to change at bus stop C to another service line at bus stop D the path would look like this:

(A:Stop:Pedestrian)-[SWITCH_TO{switchTime:2}]->
(A:Stop:Bus{serviceLine:1})-[CONNECTED_TO{travelTime:5}]->
(B:Stop:Bus{serviceLine:1})-[CONNECTED_TO{travelTime:6}]->
(C:Stop:Bus{serviceLine:1})-[SWITCH_TO{switchTime:2}]->
(C:Stop:Pedestrian)-[CONNECTED_TO{travelTime:7}]->
(D:Stop:Pedestrian)-[SWITCH_TO{switchTime:2}]->
(D:Stop:Bus{serviceLine:2})-[CONNECTED_TO{travelTime:8}]->(Z:Stop:Bus{serviceLine:2})-[SWITCH_TO{switchTime:2}]->
(Z:Stop:Pedestrian)

I want to calculate based on a desired departure time (alternatively a desired arrival time) of the user the full travel time and get the 5 best connections (less time). In the example above you can see, that the SWITCH_TO relation has a switchTime of 2. This means that I need 2 minutes to switch from the current place to the bus stop (e.g. I have to look for it). The CONNECTED_TO relation travelTimes are the time periods, that the bus needs to go from one stop to another. Lets assume that I want to start at 7:00. The first switch needs 2 minutes. Therefore I have to look at the schedule of (A:Stop:Bus) if there is a departure after 7:02. Lets assume the next departure is at 7:10. then I have to wait 8 minutes. This waiting time is no fixed value but a variable time period for each specific request. I start at 7:10. I need 5+6=11 minutes to go to stop (C:Stop:Bus) and 2 minutes to go off the bus (switch to). Then I have to go 7 minutes by foot. So if have to check the schedule of service line 2. If there is a departure after 7:00+2+8+5+6+2+7+2 = 7:32, take this. If the next departure is at 7:35 I will reach my destination at 7:00+2+8+5+6+2+7+2+3+8+2= 7:45. I know, I bit complex. :)

I prepared an example here:

CREATE (newStop:Stop:Pedestrian {
    stopName : 'A-Pedestrian',
    mode : 'Pedestrian'
})
RETURN newStop;

CREATE (newStop:Stop:Bus {
    stopName : 'A-Bus',
    mode : 'Bus',
    serviceLine : '1',
    monday:[510,610,710,810,835,910],
    tuesday:[510,610,710,810,835,910],
    wednesday:[510,610,710,810,835,910],
    thursday:[510,610,710,810,835,910],
    friday:[510,610,710,810,835,910],
    saturday:[510,610,710,810,835,910],
    sunday:[510,610,710,810,835,910]
})
RETURN newStop;

CREATE (newStop:Stop:Bus {
    stopName : 'B-Bus',
    mode : 'Bus',
    serviceLine : '1',
    monday:[515,615,715,815,840,915],
    tuesday:[515,615,715,815,840,915],
    wednesday:[515,615,715,815,840,915],
    thursday:[515,615,715,815,840,915],
    friday:[515,615,715,815,840,915],
    saturday:[515,615,715,815,840,915],
    sunday:[515,615,715,815,840,915]
})
RETURN newStop;

CREATE (newStop:Stop:Bus {
    stopName : 'C-Bus',
    mode : 'Bus',
    serviceLine : '1',
    monday:[521,621,711,821,846,921],
    tuesday:[521,621,711,821,846,921],
    wednesday:[521,621,711,821,846,921],
    thursday:[521,621,711,821,846,921],
    friday:[521,621,711,821,846,921],
    saturday:[521,621,711,821,846,921],
    sunday:[521,621,711,821,846,921]
})
RETURN newStop;

CREATE (newStop:Stop:Pedestrian {
    stopName : 'C-Pedestrian',
    mode : 'Pedestrian'
})
RETURN newStop;

CREATE (newStop:Stop:Pedestrian {
    stopName : 'D-Pedestrian',
    mode : 'Pedestrian'
})
RETURN newStop;

CREATE (newStop:Stop:Bus {
    stopName : 'D-Bus',
    mode : 'Bus',
    serviceLine : '2',
    monday:[535,635,735,835,935],
    tuesday:[535,635,735,835,935],
    wednesday:[535,635,735,835,935],
    thursday:[535,635,735,835,935],
    friday:[535,635,735,835,935],
    saturday:[535,635,735,835,935],
    sunday:[]
})
RETURN newStop;

CREATE (newStop:Stop:Bus {
    stopName : 'Z-Bus',
    mode : 'Bus',
    serviceLine : '2',
    monday:[543,643,743,843,943],
    tuesday:[543,643,743,843,943],
    wednesday:[543,643,743,843,943],
    thursday:[543,643,743,843,943],
    friday:[543,643,743,843,943],
    saturday:[543,643,743,843,943],
    sunday:[]
})
RETURN newStop;

CREATE (newStop:Stop:Pedestrian {
    stopName : 'Z-Pedestrian',
    mode : 'Pedestrian'
})
RETURN newStop;




MATCH (s1:Stop), (s2:Stop)
WHERE s1.stopName = 'A-Pedestrian' AND s2.stopName = 'A-Bus'
CREATE
    (s1)-[r:SWITCH_TO{ switchTime : 2 } ]->(s2)
RETURN s1, s2, r;

MATCH (s1:Stop), (s2:Stop)
WHERE s1.stopName = 'A-Bus' AND s2.stopName = 'B-Bus'
CREATE
    (s1)-[r:CONNECTED_TO{ travelTime : 5 } ]->(s2)
RETURN s1, s2, r;

MATCH (s1:Stop), (s2:Stop)
WHERE s1.stopName = 'B-Bus' AND s2.stopName = 'C-Bus'
CREATE
    (s1)-[r:CONNECTED_TO{ travelTime : 6 } ]->(s2)
RETURN s1, s2, r;

MATCH (s1:Stop), (s2:Stop)
WHERE s1.stopName = 'C-Bus' AND s2.stopName = 'C-Pedestrian'
CREATE
    (s1)-[r:SWITCH_TO{ switchTime : 2 } ]->(s2)
RETURN s1, s2, r;

MATCH (s1:Stop), (s2:Stop)
WHERE s1.stopName = 'C-Pedestrian' AND s2.stopName = 'D-Pedestrian'
CREATE
    (s1)-[r:CONNECTED_TO{ travelTime : 7 } ]->(s2)
RETURN s1, s2, r;

MATCH (s1:Stop), (s2:Stop)
WHERE s1.stopName = 'D-Pedestrian' AND s2.stopName = 'D-Bus'
CREATE
    (s1)-[r:SWITCH_TO{ switchTime : 2 } ]->(s2)
RETURN s1, s2, r;

MATCH (s1:Stop), (s2:Stop)
WHERE s1.stopName = 'D-Bus' AND s2.stopName = 'Z-Bus'
CREATE
    (s1)-[r:CONNECTED_TO{ travelTime : 8 } ]->(s2)
RETURN s1, s2, r;

MATCH (s1:Stop), (s2:Stop)
WHERE s1.stopName = 'Z-Bus' AND s2.stopName = 'Z-Pedestrian'
CREATE
    (s1)-[r:SWITCH_TO{ switchTime : 2 } ]->(s2)
RETURN s1, s2, r;

As you can see, in some stops, the int array of departure times could be empty too (if no connection is provided on these days). Pedestrian stops dont include an schedule of course. My question is: How can I do this query by cypher? I have to sum up these times to choose the correct next departure time. I have to know when I am at my destination. And I want to show the user the best 5 connections. Is there a way to do this? If not, any suggestions how to solve this?

Thank you very much! Stefan

Edit1: Is there a way to develop this in Java? In the simplest case it could be just one shortest path but with an intelligent cost function? Instead of using fix values using a function to calculate the cost of one specific edge? Any help would be appreciated!

Tore Eschliman · Accepted Answer

[Apologies in advance, this turned into a Russian novel when I wasn't looking. And it will still only get you the single fastest route, not the 5 fastest, but hopefully someone smarter than I can improve on this.]

You are trying to do some fiendishly complex pathing based on a difficult-to-calculate cost . You are definitely going to need to refactor some so that you can fix the cost a little more tightly, and then apply apoc.algo.dijkstra to get a viable path with weight. To do that, you're going to need to change from your very general model to some kind of event "chain", organized by physical location. The transit modes combined with the weekly schedule will provide you with some structure here; the ability to walk between locations will complicate it only moderately. Along the way, we'll end up stripping out some of the less-relevant and redundant nodes. Let's dig in.

First things first, we need to be able to convert your times into something machine-parseable. You can use apoc to convert them to isoformat or similar, but for cyclical times that will require frequent ordering, and only exist at the minute scale, I say start counting midnight on Sunday morning as minute 0 and then count your way up from there. So, minutes-since-Sunday-midnight would be your time of concern, basically, and then with some tricks later you can handle the cyclical part.

MATCH (stop:Stop:Bus)
WITH stop, ['sunday', 'monday', 'tuesday', 'wednesday', 'thursday', 'friday', 'saturday'] AS days
UNWIND RANGE(0, 6) AS day_index
WITH stop, day_index, days[day_index] AS day_key
UNWIND stop[day_key] AS int_time
WITH stop, day_index * 24 * 60 AS day_minutes, int_time % 100 AS minutes, ((int_time - (int_time % 100))/100)*60 AS hour_minutes 
WHERE day_minutes IS NOT NULL
WITH stop, day_minutes + hour_minutes + minutes AS time_since_sunday
MERGE (dep:Departure:Bus {time: time_since_sunday})
MERGE (dep) - [:AT] -> (stop)
WITH stop, time_since_sunday
ORDER BY time_since_sunday
WITH stop, collect(time_since_sunday) AS times
SET stop.departure_times = times

Alright, that gives us departure events around each bus stop that represent times at which you can start a fixed-length trip elsewhere, if you are at the station before that time. Now, if we were only accounting for transit-based movement, we could connect each :Departure node at one :Stop to whatever :Departure node was available at the next stop after the transit time had elapsed (and account for wait time). Adding in walking (multi-mode transport) changes this a bit, though, since whenever transit arrives somewhere, you can just "depart" on your own two feet instantly. So, we should model :Arrival nodes to correspond to the other end of :Departure based trips, so that we can differentiate between waiting for the next transit-based :Departure and walking, time-wise.

MATCH (stop:Stop:Bus) <- [:AT] - (dep:Departure:Bus)
WITH stop, dep, dep.time AS dep_time
MATCH (stop) - [r:CONNECTED_TO] -> (other:Stop:Bus)
WITH dep, dep_time, dep_time + r.travelTime AS arrival_time, other
MERGE (a:Arrival:Bus {time: arrival_time})
MERGE (a) - [:AT] -> (other)
MERGE (dep) - [:TRAVEL {time: arrival_time - dep_time, mode: 'Bus'}] -> (a)
WITH a, arrival_time, other, other.departure_times AS dep_times
WITH a, other, arrival_time, REDUCE(s = HEAD(dep_times), x IN TAIL(dep_times) | CASE WHEN x < arrival_time THEN s WHEN s < x THEN s ELSE x END) AS next_dep_time
WITH a, other, next_dep_time, next_dep_time - arrival_time AS wait_time
MATCH (other) <- [:AT] - (next_dep:Departure:Bus {time: next_dep_time})
MERGE (a) - [:TRAVEL {time: wait_time, mode: 'Wait'}] -> (next_dep)

Alright, so for each individual line of transit, we can now calculate how long it will take to get from A to B on that line of transit alone, even if the travel is non-continuous (train hangs out at a stop, etc.) Now the fun part: integrating the walking option! The idea of a "pedestrian stop" doesn't make a lot of sense (unless you are modelling all relevant discrete space, in which case good luck and godspeed), so we're basically going to drop them entirely. A :SWITCH_TO that operates between two non-pedestrian stops will just be modeled as walking between the two stops, while a :SWITCH_TO a pedestrian transit will be converted to a long walk combining both switches and the walk itself. First the easy ones (which are not in your sample paths, but I think will be valuable to you eventually):

MATCH (stop:Stop:Bus) - [r:SWITCH_TO] -> (other:Stop)
WHERE NOT other:Pedestrian
WITH stop, other, r.switchTime AS walking_time, other.departure_times AS dep_times
MATCH (stop) <- [:AT] - (arr:Arrival)
WITH arr, other, walking_time, dep_times, arr.time + walking_time AS new_arr_time
MERGE (new_arr:Arrival:Pedestrian {time: new_arr_time})
MERGE (new_arr) - [:AT] -> (other)
MERGE (arr) - [:TRAVEL {time:walking_time, mode: 'Pedestrian'}] -> (new_arr)
WITH new_arr, other, new_arr_time, REDUCE(s = HEAD(dep_times), x IN TAIL(dep_times) | CASE WHEN x < new_arr_time THEN s WHEN s < x THEN s ELSE x END) AS next_dep_time
WITH new_arr, other, next_dep_time, next_dep_time - new_arr_time AS wait_time
MATCH (other) <- [:AT] - (next_dep:Departure {time:next_dep_time})
MERGE (new_arr) - [:TRAVEL {time: wait_time, mode: 'Wait'}] -> (next_dep)

Okay, so that takes care of the basic transfers. This model assumes, btw, that if you are going to switch between bus or train "lines", that you will model them as separate stops (with walking times of 0 if they really are the same place is fine, but it's much more complicated to track if you share :Stops). Now to take care of more complicated transfers, which were formerly modelled as switching to :Pedestrian, traveling, and then switching back:

MATCH (stop:Stop:Bus) - [r1:SWITCH_TO] -> (:Stop:Pedestrian) - [r2:CONNECTED_TO] -> (:Stop:Pedestrian) - [r3:SWITCH_TO] -> (other:Stop)
WITH stop, other, other.departure_times AS dep_times, REDUCE(s = 0 , x IN [r1, r2, r3] | s + COALESCE(x.travelTime, x.switchTime) ) AS walking_time
MATCH (stop) <- [:AT] - (arr:Arrival)
WITH arr, other, dep_times, walking_time, arr.time + walking_time AS new_arr_time
MERGE (new_arr:Arrival:Pedestrian {time:new_arr_time})
MERGE (new_arr) - [:AT] -> (other)
MERGE (arr) - [:TRAVEL {time:walking_time, mode: 'Pedestrian'}] -> (new_arr)
WITH new_arr, new_arr_time, other, REDUCE(s = HEAD(dep_times), x IN TAIL(dep_times) | CASE WHEN x < new_arr_time THEN s WHEN s < x THEN s ELSE x END) AS next_dep_time
WITH new_arr, other, next_dep_time, next_dep_time - new_arr_time AS wait_time
MATCH (other) <- [:AT] - (next_dep:Departure {time:next_dep_time})
MERGE (new_arr) - [:TRAVEL {time: wait_time, mode: 'Wait'}] -> (next_dep)

That gives you the basic framework of weighted relationships to apply the dijkstra algorithm to your inter-stop movement. There may be some estimation to figure out which stops might give good results, and how to walk to/from them from/to a given point (see earlier statement re: modeling all discrete space), but if you can pinpoint a :Departure or :Arrival node to begin at (or a few candidates), and a :Stop node for the other end of your query:

MATCH (:Stop {stopName: {begin_id} }) <- [:AT] - (d:Departure {time: {depart_time} })
WITH d
MATCH (end:Stop {stopName: {end_id} })
CALL apoc.algo.dijkstraWithDefaultWeight(d, end, 'TRAVELS>|AT>', 'time', 0) YIELD path, weight
RETURN path

Cypher: Calculate based on node properties

Answers (1)

Related Questions