alexanoid
alexanoid

Reputation: 25790

Cypher Neo4j query optimization

I use the following Cypher query:

MATCH (v:Value)-[:CONTAINS]->(hv:HistoryValue) 
WHERE v.id = {valueId} 
OPTIONAL MATCH (hv)-[:CREATED_BY]->(u:User) 
WHERE {fetchCreateUsers} 
WITH u, hv 
ORDER BY hv.createDate DESC 
WITH count(hv) as count, ceil(toFloat(count(hv)) / {maxResults}) as step, COLLECT({userId: u.id, historyValueId: hv.id, historyValue: hv.originalValue, historyValueCreateDate: hv.createDate}) AS data 
RETURN REDUCE(s = [], i IN RANGE(0, count - 1, CASE step WHEN 0 THEN 1 ELSE step END) | s + data[i]) AS result, step, count

This query works fine and does exactly what I need.

Right now I'm concerned about two possible issues inside of this query from the performance point of view and Cypher best practices.

First of all, as you may see - I two times use the same count(hv) function. Will it cause the problems during the execution from the performance point of view or Cypher and Neo4j are smart enough to optimize it? If no, please show how to fix it.

And the second place is the CASE statement inside range() function? The same question here - will this CASE statement be executed only once or every time for every iteration over my range? Please show how to fix it if needed.

UPDATED

I tried to do a separator WITH for count but the query doesn't return the results(returns empty result)

MATCH (v:Value)-[:CONTAINS]->(hv:HistoryValue) 
WHERE v.id = {valueId} 
OPTIONAL MATCH (hv)-[:CREATED_BY]->(u:User) 
WHERE {fetchCreateUsers} 
WITH u, hv ORDER BY hv.createDate DESC 
WITH u, hv, count(hv) as count 
WITH u, hv, count, ceil(toFloat(count) / {maxResults}) as step, COLLECT({userId: u.id, historyValueId: hv.id, historyValue: hv.originalValue, historyValueCreateDate: hv.createDate}) AS data 
RETURN REDUCE(s = [], i IN RANGE(0, count - 1, CASE step WHEN 0 THEN 1 ELSE step END) | s + data[i]) AS result, step, count

Upvotes: 0

Views: 95

Answers (1)

Gabor Szarnyas
Gabor Szarnyas

Reputation: 5047

1 MATCH (v:Value)-[:CONTAINS]->(hv:HistoryValue) 
2 WHERE v.id = {valueId} 
3 OPTIONAL MATCH (hv)-[:CREATED_BY]->(u:User) 
4 WHERE {fetchCreateUsers} 
5 WITH u, hv 
6 ORDER BY hv.createDate DESC 
7 WITH count(hv) as count, ceil(toFloat(count(hv)) / {maxResults}) as step, COLLECT({userId: u.id, historyValueId: hv.id, historyValue: hv.originalValue, historyValueCreateDate: hv.createDate}) AS data 
8 RETURN REDUCE(s = [], i IN RANGE(0, count - 1, CASE step WHEN 0 THEN 1 ELSE step END) | s + data[i]) AS result, step, count

(1) You need to pass hv in line 5, because it's values are collected in line 7. That said, you can still do something like this:

5 WITH u, collect(hv) AS hvs, count(hv) as count
  UNWIND hvs AS hv

However, this is not very elegant and probably not worth doing.

(2) You can calculate the CASE expression in line 7:

7 WITH count, data, step, CASE step WHEN 0 THEN 1 ELSE step END AS stepFlag
8 RETURN REDUCE(s = [], i IN RANGE(0, count - 1, stepFlag) | s + data[i]) AS result, step, count

Upvotes: 1

Related Questions