How to paginate a search result using same query with different skip and limit values

Question

Some of you might argue that this is a question for programmers.stackexchange.com but, having read through Help Center of Stack Overflow I believe this is a specific programming problem and I am more likely to get a response here.

I have a webapp that uses ExpressJS with Neo4j database as backend. I have a search screen, where I would like to use the power of Neo4j's relationships. The search screen accepts one or more values (i.e. manufacture year, fuel type, gearbox, etc etc) and then make a post request to ExpressJS where I construct a cypher query using the parameters of POST request as shown below as an example:

MATCH
  (v:VEHICLE),(v)-[:VGEARBOX_IS]->(:VGBOX{type:'MANUAL'}),
  (v)-[:VCONDITION_IS]->(:VCONDITION{condition:'USED'})  
WITH DISTINCT v
WHERE  v.manufacture_year = 1990
MATCH (v)-[r]->(info)
RETURN v AS vehicle, COLLECT({type:type(r), data:info}) AS details

Let's say that running above query, returns the following three vehicles and its properties

If more than 20 vehicles in result then I want to paginate the result and I know how that works, we make use of SKIP and LIMIT as shown below:

MATCH
  (v:VEHICLE)
OPTIONAL MATCH (v)-[r:VFUEL_TYPE|:VGEARBOX_IS|:VHAVING_COLOR|...]->(info)
RETURN
  v.vehicle_id AS vehicle_id,
  v.engine_size AS engine_size,
  v.published_date AS published_date,
  COUNT(v) AS count,
  COLLECT({type:type(r), data:info}) as data
ORDER BY v.published_date DESC
SKIP 20
LIMIT 16

Here is the workflow,

User navigates to search screen which is a form with POST method and various input fields.
User selects some options based on which he/she wish to search.
User then submits the form, which makes a post request to the server.
This request is handled by a ROUTE which uses the parameters of the request to construct a cypher query shown above. It runs the cypher against the Neo4j database and receive the result.
Let's assume, there are 200 vehicles that match the search result. I then want to display only 20 of those results and provide a next/previous button.
When user is done seeing the first 20, he/she wants to see the next 20, that is when I have to re-run the same query that user submitted initially but, with SKIP value of 20 (SKIP value keeps incrementing 20 as user navigates to next page, and decrement 20 as your moves to previous page).

My question is, what is the best approach to save search request (or the cypher generated by original request) so that when user clicks next/previous page, I re-run the original search cypher query with different SKIP value? I don't want to make a fresh POST request every time the user goes to next/previous page. This problem can be resolved in the following manner but, not sure which is more performance-friendly?

Every time the user clicks next or previous page, I make a new POST request with preserved values of original request and rebuild the cypher query (POSTs can be costly - I want to avoid this, please suggest why this is better option)
I store the original cypher query in Redis and whenever the user clicks next or previous, I retrieve the query specific to that user (need to handle this either via cookie, session or some sort of hidden uuid) from Redis, supply the new value for SKIP and re-run it (somehow I have to handle when I should delete this entry from Redis - deletion should happen when user change his search or abandon the page/site).
I store the query in session (user does not have to be logged in) or some other temporary storage (than Redis) that provide fast access (not sure if that is safe and efficient)

I am sure somebody came across this issue and it in an efficient manner which is why I post the question here. Please advise how I can best resolve this problem.

Brian Underwood · Accepted Answer

As far as performance goes, the first thing that you should absolutely do is to use Cypher parameters. This is a way to separate your query string from your dynamic data. This has the advantage that you are guarded against injection attacks, but it also is more performance because if your query string doesn't change, Neo4j can cache a query plan for your query and use it over and over again. With parameters your first query would look like this:

MATCH
  (v:VEHICLE),(v)-[:VGEARBOX_IS]->(:VGBOX{type: {vgearbox_type}}),
  (v)-[:VCONDITION_IS]->(:VCONDITION{condition: {vcondition}})  
WITH DISTINCT v
WHERE  v.manufacture_year = {manufacture_year}
MATCH (v)-[r]->(info)
RETURN v AS vehicle, COLLECT({type:type(r), data:info}) AS details
SKIP ({page} - 1) * {per_page}
LIMIT {per_page}

Your javascript library for Neo4j should allow you to pass down a separate object. Here is what the object would look like represented in json:

{
  "vgearbox_type": "MANUAL",
  "vcondition": "USED",
  "manufacture_year": 1990,
  "page": 1,
  "per_page": 20
}

I don't really see much of a problem with making a fresh query to the database from Node each time. You should benchmark how long it actually takes to see if it really is a problem.

If it is something that you want to address with caching, it depends on your server setup. If the Node app and the DB are on the same machine or very close to each other, probably it's not important. Otherwise you could use redis to cache based on a key which is a composite of the values that you are querying for. If you are thinking of caching on a per-user basis, you could even use the browser's local storage, but are users often re-visiting the same pages over and over? How static is your data and does the data vary from user to user?

How to paginate a search result using same query with different skip and limit values

Answers (1)

Related Questions