larsw

Reputation: 3830

Is it possible to reduce/optimize this query for node degrees?

Given the following Cypher query that returns afferent (inbound) and efferent (outbound) connections, and the sum as the node degree:

START n = node(*)
RETURN n.name, length((n)-->()) AS efferent,
length((n)<--()) AS afferent,
length((n)-->()) + length((n)<--()) AS degree

Is it possible to reduce the query so that the two length() functions are not repeated in the summation in the degree column?

Upvotes: 1

Views: 67

Answers (1)

jjaderberg

Reputation: 9952

You can compute the two length() values once and bind them to names in a WITH clause before returning. RETURN can then sum the bound values instead of repeating the computations.

START n = node(*)
WITH n, length((n)-->()) AS efferent, length((n)<--()) AS afferent
RETURN n.name, efferent, afferent, efferent + afferent AS degree

You may want to use MATCH (n) instead of START n = node(*) if your Neo4j version is 2.0 or later, but that's not what you're asking about, so I'll assume you know what you are doing.
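
For reference, here is a sketch of the same query with MATCH as the starting point. It assumes Neo4j 2.x, where length() still accepts a pattern expression; later versions changed this, so check the documentation for the version you run.

```cypher
// Same computation as above, but starting from MATCH instead of START.
MATCH (n)
WITH n, length((n)-->()) AS efferent, length((n)<--()) AS afferent
RETURN n.name, efferent, afferent, efferent + afferent AS degree
```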

EDIT

In Neo4j 1.x START is how you began a query. From 2.x and on, while START is still around, MATCH is the preferred way. If you have Neo4j 2.x and don't know a particular reason why you should use START, then you should use MATCH. Here's a short explanation of why.

Your query is written to touch the entire graph. When that is the intention there is not a very big difference between START n = node(*) and MATCH (n). The execution plans do differ, but I'm not aware that the difference is very important.

If, however, you want to perform your computations on only part of the graph, and you restrict the starting point of the query accordingly, then the differences become significant. If, for example, you want to perform your computation only on nodes with the :User label, then

START n = node(*)
WHERE n:User

will still pull up all nodes, and then apply a filter to discard those that don't have the label, whereas

MATCH (n)
WHERE n:User

will only pull up the nodes that have that label to begin with.

The general difference is this: WHERE is a dependent clause accompanying START, MATCH, OPTIONAL MATCH or WITH. When it accompanies START or WITH it does not work by modifying the operation but by filtering the results; when it accompanies MATCH and OPTIONAL MATCH it modifies (as often as it can) the operation and therefore doesn't have to filter the results. The difference is that between shouting "Everyone, if you are my child, don't go into the road" and "Kids, don't go into the road".

There are cases when WHERE is not pulled into the MATCH clause. One example is

MATCH (n)
WHERE n:Male OR n:Female

In this case all nodes are pulled up and then filtered, just as if we had used START instead of MATCH.

Sometimes it is easy to know which predicates in the WHERE clause can be pulled in to modify the MATCH. This is the case for predicates that you can move into the MATCH clause yourself, simply by rearranging the query. The first MATCH example above could also be expressed as

MATCH (n:User)

There is no way, however, to do this for the WHERE clause in the second MATCH example, WHERE n:Male OR n:Female.
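
One possible workaround, offered only as a sketch and not as part of the original answer: split the disjunction into two label matches and combine them with UNION, so that each half can start from its own label. Whether this actually beats the filtered scan depends on the data and the planner, so it is worth profiling both forms.

```cypher
// Each branch starts from a single label; UNION (without ALL)
// removes duplicates, so a node carrying both labels appears once.
MATCH (n:Male)
RETURN n
UNION
MATCH (n:Female)
RETURN n
```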

That a WHERE predicate cannot be moved into the MATCH clause by reformulating the query is not a reliable indicator that the query planner is unable to make use of it in the match operation. Because Cypher is a declarative language, you ultimately have to trust the query planner to implement the instructions wisely; trust, but verify.
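
One way to do the verifying is to prefix a query with PROFILE and read the execution plan it reports. This is a minimal sketch, assuming a Neo4j version and client (shell or browser) that support the PROFILE prefix; the operator names in the output vary between versions.

```cypher
// Runs the query and reports the plan that was actually used.
PROFILE
MATCH (n)
WHERE n:User
RETURN n.name
```

If the label predicate was pulled into the match, the plan should start from a label-based scan rather than a scan of all nodes followed by a filter.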

One other difference between START and MATCH pertains to indexing. If you use 'legacy indexing' then you need to use START to access those indices. The 'new' label indices (about two years old, I believe) have been continuously improved in features and efficiency, and we are running out of reasons to use the old indices. I think the only reason left may be full-text indexing, for which a configured legacy Lucene index is still necessary. In time this feature will also be added to the label indices. Possibly, at that point, the START clause will be removed from Cypher altogether, but that is just the author's speculation.
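
To make the contrast concrete, here is a sketch of both kinds of lookup in Neo4j 2.x syntax. The :User label, the name property and the value "larsw" are placeholders chosen for illustration, and the legacy lookup assumes that node auto-indexing has been configured for the name property.

```cypher
// Legacy index lookup: only reachable through START.
// Assumes node auto-indexing is enabled for the name property.
START n = node:node_auto_index(name = "larsw")
RETURN n;

// Label (schema) index: create it once...
CREATE INDEX ON :User(name);

// ...after which a plain MATCH on label and property can use it.
MATCH (n:User)
WHERE n.name = "larsw"
RETURN n;
```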

Upvotes: 2
