Reputation: 449
I have 2 nodes created in graph database, origin airport and destination airport. It is related by a property named 'delayed_by'.
MATCH (origin:origin_airport {name: row.ORIGIN}),
(destination:dest_airport {name: row.DEST})
CREATE (origin)-[:delayed_by {carr_delay: row.carr_delay}]->(destination)
CREATE (origin)-[:delayed_by {weather_delay: row.weather_delay}]->
(destination)
CREATE (origin)-[:delayed_by {nas_delay: row.nas_delay}]->(destination)
delayed_by holds the value delays caused due to carrier delay, weather delay and nas delay. Here I need to group_by origin and destination and calculate the average value of sum of all 3 delays using cypher query in Neo4j. Representing in tabular format, my input file will of the format as described below:
ORIGIN DEST carr_delay weather_delay nas_delay
ABE ATL 492 56 56
ABE DTW 412 0 47
ABQ ATL 181 0 218
I am expecting the result in the below format.
ORIGIN DEST Avg_delay
ABE ATL 201.33
ABE DTW 153
ABQ ATL 133
I am using the below query:
MATCH (oa:origin_airport)-[d:delayed_by]->(da:dest_airport)
RETURN oa.name AS Origin, da.name AS Destination,
AVG((toFloat(d.carr_delay))+(toFloat(d.weather_delay))+
(toFloat(d.nas_delay))) As avg_delay
ORDER BY avg_delay DESC
LIMIT 10
But getting null values for Avg_delay.
ORIGIN DEST Avg_delay
ABE ATL NULL
ABE DTW NULL
ABQ ATL NULL
Upvotes: 1
Views: 858
Reputation: 29172
1) I recommend to change the model: for the delay relationship add the property type of delay, and the delay value save in the property value.
MATCH (origin:origin_airport {name: row.ORIGIN}),
(destination:dest_airport {name: row.DEST})
CREATE (origin)
-[:delayed_by {type: "carr_delay", value: row.carr_delay}]->
(destination)
CREATE (origin)
-[:delayed_by {type: "weather_delay", value: row.weather_delay}]->
(destination)
CREATE (origin)
-[:delayed_by {type: "nas_delay", value: row.nas_delay}]->
(destination)
2) And the desired query:
MATCH (oa:origin_airport)-[d:delayed_by]->(da:dest_airport)
RETURN oa.name AS Origin,
da.name AS Destination,
AVG( toFloat(d.value) ) As avg_delay
ORDER BY avg_delay DESC
LIMIT 10
Upvotes: 1
Reputation: 39905
The problem here is that not for every relationships you have all properties( carr_delay
, weather_delay
, nas_delay
) set.
A non existing property evaluates to NULL
- which is different from 0
. AVG(1+NULL)
evaluate to NULL
. So you need to map NULL
to 0
. To fix that consider using coalesce
:
MATCH (oa:origin_airport)-[d:delayed_by]->(da:dest_airport)
RETURN oa.name AS Origin, da.name AS Destination,
AVG(
coalesce(toFloat(d.carr_delay),0.0))+
coalesce(toFloat(d.weather_delay),0.0))+
coalesce(toFloat(d.nas_delay),0.0))
) As avg_delay
ORDER BY avg_delay DESC
LIMIT 10
Upvotes: 2