Why is this Cypher query faster?

Question

I've just read this page of the official Neo4j documentation.
It shows a Cypher query that retrieves only friends of friends:

MATCH (joe { name: 'Joe' })-[:knows*2..2]-(friend_of_friend)
WHERE NOT (joe)-[:knows]-(friend_of_friend)
RETURN friend_of_friend.name, COUNT(*)
ORDER BY COUNT(*) DESC , friend_of_friend.name

Why is the following query faster?

MATCH path = shortestPath((joe { name: 'Joe' })-[:KNOWS*..2]-(friend_of_friend))
WHERE length(path) = 2
WITH nodes(path)[-1] AS secondDegreeFriends //retrieving the friend_of_friend nodes
RETURN secondDegreeFriends.name, COUNT(*)
ORDER BY COUNT(*) DESC , secondDegreeFriends.name

(33 ms for the first query vs 22 ms for the second, both run against a graph of 182 members)

Christophe Willemsen · Accepted Answer

First, without some test data it is hard to demonstrate exactly how the two queries differ.

That said, a few points stand out:

In the first query, you use neither a label nor an indexed property, so the whole pattern match falls back to a traversal matcher, which amounts to a global graph lookup.
A negation is always costly, and a negated pattern in a WHERE clause is even more costly.
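To illustrate the first point, here is a minimal sketch of the first query with a label and an index added. The label `Person` is an assumption (the question does not say which labels the graph uses), and the index statement uses the older Neo4j 2.x/3.x syntax; newer versions spell it `CREATE INDEX FOR (p:Person) ON (p.name)`.

```cypher
// Assumption: the member nodes carry a (hypothetical) :Person label.
// Older index syntax (Neo4j 2.x/3.x); run it once, as a separate statement.
CREATE INDEX ON :Person(name);

// Same query as in the question, but the start node can now be found
// through the index on :Person(name) instead of scanning every node.
MATCH (joe:Person { name: 'Joe' })-[:knows*2..2]-(friend_of_friend)
WHERE NOT (joe)-[:knows]-(friend_of_friend)
RETURN friend_of_friend.name, COUNT(*)
ORDER BY COUNT(*) DESC, friend_of_friend.name;
```

On a graph of 182 nodes the gain may be barely measurable, but it should grow with the size of the graph.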

I suggest you run both queries in the shell with PROFILE and examine the resulting execution plans.
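As a sketch, profiling the first query only requires prefixing it with the PROFILE keyword; do the same for the second query and compare the two plans. The exact operators and figures reported depend on your Neo4j version and data.

```cypher
// PROFILE executes the query and prints the execution plan,
// including the rows and database hits recorded for each operator.
PROFILE
MATCH (joe { name: 'Joe' })-[:knows*2..2]-(friend_of_friend)
WHERE NOT (joe)-[:knows]-(friend_of_friend)
RETURN friend_of_friend.name, COUNT(*)
ORDER BY COUNT(*) DESC, friend_of_friend.name;
```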

shortestPath, when used on patterns that do not specify a relationship direction, uses a better algorithm for path matching.
