Reputation: 152
I'm trying to understand exactly how Cypher returns undirected paths. Here are two queries and their results (they are the same except one uses an undirected pattern):
MATCH path = (a)-[]->(b) RETURN count(path), count(distinct(path))
count(path):1236951
count(distinct(path)):1236951
MATCH path = (a)-[]-(b) RETURN count(path), count(distinct(path))
count(path):2473901
count(distinct(path)):2473901
I have two specific questions about this.
a.) Why is count(path) in the undirected case not exactly twice what it is in the directed case (it's off by 1)?
b.) In the undirected case, why is count(distinct(path)) the same as count(path)? I would have expected Cypher to match each path twice, once in each direction, and then not count the duplicates in count(distinct(path)). Is it not counting the two directions as the same path?
Upvotes: 2
Views: 2309
Reputation: 489
This is a guess, as I'm new to Neo4j myself, but as for why the undirected path count isn't double the directed one, I suspect it has something to do with the direction of the relationships in the graph.
You are querying for a path in any direction, but if the graph is directional then a relationship only exists along that direction, so querying a relationship in the opposite direction will return null.
Perhaps the reason the count results are the same is that every relationship defined in your graph has a direction.
Upvotes: 0
Reputation: 66999
(a) I do not have an answer for why the path count for your second query is not exactly double that of your first query (other than a possible bug, or data corruption). However, you should be able to get the relationship(s) that do not have the expected count using the query below. That might provide helpful clues.
MATCH path = (a)-[r]-(b)
WITH r, COUNT(*) AS num
WHERE num <> 2
RETURN r;
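To illustrate what that diagnostic query is looking for, here is a minimal Python sketch on a made-up edge list. It is a toy model, not Neo4j itself. The data and the function name undirected_matches are hypothetical, and the self-loop rule in it is an assumption about how an undirected pattern binds a relationship whose start and end node are the same. Under that assumption, a single self-loop would make the undirected count one short of double, and it is the only relationship whose per-relationship count is not 2.

```python
from collections import Counter

# Hypothetical toy graph: (relationship id, start node, end node).
# Relationship 2 is a self-loop (C to C).
RELS = [(0, "A", "B"), (1, "B", "C"), (2, "C", "C")]

def undirected_matches(rels):
    """Yield (rel_id, a, b) rows as an undirected pattern is assumed to bind them."""
    for rel_id, start, end in rels:
        yield (rel_id, start, end)
        # Assumption: a self-loop has no distinct reverse orientation,
        # so it contributes one row instead of two.
        if start != end:
            yield (rel_id, end, start)

rows = list(undirected_matches(RELS))

# Equivalent of: WITH r, COUNT(*) AS num WHERE num <> 2 RETURN r
per_rel = Counter(rel_id for rel_id, _, _ in rows)
odd_rels = [rel_id for rel_id, num in per_rel.items() if num != 2]

print(len(RELS), len(rows), odd_rels)
```

If the real graph behaves like this model, the query above should return exactly one relationship, and checking whether its start and end nodes are the same would confirm or rule out the self-loop idea.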
(b) Each "path" consists of an ordered sequence of nodes separated by relationships. When you traverse a path in opposite directions, the result is 2 different ordered sequences, which are not equivalent. If you had instead counted distinct relationships, as below, you would have gotten an nRels value that is half of nPaths (or close to it, taking into account (a)):
MATCH path = (a)-[r]-(b)
RETURN count(path) AS nPaths, count(distinct r) AS nRels;
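The difference between distinct paths and distinct relationships can be sketched in plain Python. This is only a toy model with hypothetical data: each path is represented as an ordered tuple (first node, relationship id, last node), which is an assumed stand-in for how a one-hop path value compares, not Neo4j's internal representation.

```python
# Hypothetical toy graph: (relationship id, start node, end node), no self-loops.
RELS = [(0, "A", "B"), (1, "B", "C"), (2, "A", "C")]

# An undirected one-hop match visits every relationship in both orientations.
paths = []
for rel_id, start, end in RELS:
    paths.append((start, rel_id, end))  # traversed along its direction
    paths.append((end, rel_id, start))  # traversed against its direction

n_paths = len(paths)
# The two orientations are different ordered sequences, so none collapse.
n_distinct_paths = len(set(paths))
# Counting by relationship identity collapses each pair into one.
n_rels = len({rel_id for _, rel_id, _ in paths})

print(n_paths, n_distinct_paths, n_rels)
```

In this model the distinct path count equals the path count, while the distinct relationship count is half of it, which mirrors the numbers in the question.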
Upvotes: 2