Reputation: 285
I need to paginate (limit + offset) the results of multiple performance-intensive queries.
At first, I simulated pagination using a Python generator:
first = 100
# Offset
skip = 50
cursor = 0
# Generator is returned by neo4j so we don't have a significant performance impact
q1_results = tx.run(.....)
q2_results = tx.run(.....)
for result in q1_results:
    if cursor < first:
        yield result
        cursor += 1
for result in q2_results:
    if cursor < first:
        yield result
        cursor += 1
However, the problem here is enforcing the offset: to achieve it programmatically, I have to iterate over the leading results and discard them before yielding anything, like this:
first = 100
# Offset
skip = 50
cursor = 0
skip_cursor = 0
# Generator is returned by neo4j so we don't have a significant performance impact
q1_results = tx.run(.....)
q2_results = tx.run(.....)
for result in q1_results:
    # Use `and`, not `&`: `&` binds tighter than the comparisons.
    # Use `>=` so that exactly `skip` records are discarded.
    if cursor < first and skip_cursor >= skip:
        yield result
        cursor += 1
    else:
        skip_cursor += 1
for result in q2_results:
    if cursor < first and skip_cursor >= skip:
        yield result
        cursor += 1
    else:
        skip_cursor += 1
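The two loops above can be collapsed with the standard library's itertools. This is only a minimal sketch: the helper name paginate is hypothetical, and the ranges stand in for the two query results. It tidies the client-side code but does not avoid consuming the skipped records.

```python
from itertools import chain, islice


def paginate(result_streams, skip, first):
    """Yield at most `first` records after discarding `skip`, across all streams in order."""
    # Treat the separate result iterables as one continuous stream.
    combined = chain.from_iterable(result_streams)
    # islice consumes and drops the first `skip` records, then stops after `first` more.
    return islice(combined, skip, skip + first)


# Stand-ins for q1_results and q2_results; any iterables work here.
q1_results = iter(range(0, 80))
q2_results = iter(range(80, 200))
page = list(paginate([q1_results, q2_results], skip=50, first=100))
```

Because islice stops pulling from the stream once the page is full, records after the page are never requested from the iterators, although the skipped ones still are.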
Then I tried combining the queries into one big query, but that required aggregation (like collect and distinct), so it had an enormous performance impact and the queries became really slow.
I'm wondering if I'm missing something and if there is a proper way to achieve pagination in that scenario.
Upvotes: 1
Views: 539
Reputation: 4495
At the moment, the proper way to do this is to use SKIP and LIMIT in your Cypher query. The underlying protocol has no mechanism to return only a portion of your query result, so even with your code, you will still generate, send and buffer the entire result set.
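As a rough illustration of pushing the pagination into Cypher, the sketch below passes the offset and page size as query parameters. The function name fetch_page, the Item label and the name property are placeholders, not taken from the question; the ORDER BY is included because paging without a stable order can return overlapping or missing records between pages.

```python
def fetch_page(tx, skip, limit):
    """Run one paginated query inside a transaction and return the page as a list."""
    # Hypothetical query: the label, property and ordering are placeholders.
    query = (
        "MATCH (n:Item) "
        "RETURN n.name AS name "
        "ORDER BY name "
        "SKIP $skip LIMIT $limit"
    )
    # Keyword arguments are sent to the server as query parameters.
    result = tx.run(query, skip=skip, limit=limit)
    return [record["name"] for record in result]
```

With the official Python driver this would typically be called as a transaction function, for example session.read_transaction(fetch_page, 50, 100) in older driver versions, or session.execute_read(fetch_page, 50, 100) in 5.x.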
We have an item on our roadmap to introduce full flow control, along with a reactive API. This will enable full-stack support for incremental delivery of records, with options to skip records and cancel the stream. But this is a complex change, so it won't arrive until the end of this year at the earliest. Until then, your best bet is to use the Cypher keywords.
Upvotes: 2