Reputation: 1691
Application Background: I am working on an application that consists of a client side and a server side. The server side consumes data every 5 minutes from an external API feed, transforms it, and saves it into a Neo4j graph database. The client side fetches all stored data by calling the server side and builds a chart from the received data:
http://decisionwanted.com/decisions/2/bitcoin
More saving details: each time new data is consumed, I create new history value nodes with new relationships to the existing Value (root) node.
Issue: The server side returns all data stored so far by running the following Cypher query:
MATCH (v:Value)-[rvhv:CONTAINS]->(hv:HistoryValue)
WHERE v.id = {valueId}
OPTIONAL MATCH (hv)-[ru:CREATED_BY]->(u:User)
WHERE {fetchCreateUsers}
RETURN ru, u, rvhv, v, hv
ORDER BY hv.createDate DESC
Since the total data volume grows after each consume operation, query performance degrades and latency increases.
Questions:
For example: there are 1000 history value nodes stored, and I want to return only every hundredth element, starting from the 1st and ending with the 1000th.
So the result set of the query should contain nodes 1, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000.
The approach looks good to me. The only problem is, how can I tell this Cypher query:
MATCH (v:Value)-[rvhv:CONTAINS]->(hv:HistoryValue)
WHERE v.id = {valueId}
OPTIONAL MATCH (hv)-[ru:CREATED_BY]->(u:User)
WHERE {fetchCreateUsers}
RETURN ru, u, rvhv, v, hv
ORDER BY hv.createDate DESC
to return only every hundredth element? Does anyone know how to do it?
Upvotes: 1
Views: 263
Reputation: 66967
Because your query is using ORDER BY at the end, Cypher has to generate all result rows and then sort them. If, as stated in question #2, you want to limit the results to a time range, you should filter for that as early as possible to minimize the amount of work. For example, if you are only interested in createDate values within parameterized startDate and endDate values:
MATCH (v:Value)-[rvhv:CONTAINS]->(hv:HistoryValue)
WHERE v.id = {valueId} AND {startDate} <= hv.createDate <= {endDate}
OPTIONAL MATCH (hv)-[ru:CREATED_BY]->(u:User)
WHERE {fetchCreateUsers}
RETURN ru, u, rvhv, v, hv
ORDER BY hv.createDate DESC
In addition to performing the above early-filtering, the following query returns a collection of the rows at indexes 0, 100, 200, ..., 1000:
MATCH (v:Value)-[rvhv:CONTAINS]->(hv:HistoryValue)
WHERE v.id = {valueId} AND {startDate} <= hv.createDate <= {endDate}
OPTIONAL MATCH (hv)-[ru:CREATED_BY]->(u:User)
WHERE {fetchCreateUsers}
WITH ru, u, rvhv, v, hv
ORDER BY hv.createDate DESC
LIMIT 1001
WITH COLLECT({ru: ru, u: u, rvhv: rvhv, v: v, hv: hv}) AS data
RETURN REDUCE(s = [], i IN RANGE(0, 1000, 100) | s + data[i]) AS result;
The LIMIT 1001 clause limits the size of the data collection to just 1001 rows of data (because index 1000 is for row 1001).
RANGE(0, 1000, 100) is used to generate the indexes of the rows of interest.
The REDUCE function is used to generate the resulting collection of data at those indexes.
Upvotes: 3
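To check which rows the RANGE/REDUCE step picks, the same index arithmetic can be sketched outside the database. This is only an illustration in Python, not part of the Cypher answer: the sample_rows helper and the stand-in rows list are hypothetical, and skipping indexes that fall past the end of a short result is an assumption made here, not a description of how Cypher treats that case.

```python
def sample_rows(data, last_index=1000, step=100):
    """Pick the elements at indexes 0, step, 2*step, ..., last_index."""
    # Cypher's RANGE(0, 1000, 100) includes its end value, so add 1 for Python's range().
    indexes = range(0, last_index + 1, step)
    # Assumption: indexes beyond the end of the list are skipped.
    return [data[i] for i in indexes if i < len(data)]


# Stand-in for 1001 rows already sorted by createDate DESC.
# Each value is the row's 1-based position, so index 0 holds row 1.
rows = list(range(1, 1002))

picked = sample_rows(rows)
print(picked)
```

With 1001 rows this selects 11 of them, from row 1 (index 0) through row 1001 (index 1000), which is why the query uses LIMIT 1001 rather than LIMIT 1000.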