Same Cypher Query has different performance on different DBs

Question

I have a full DB (a graph clustered by country) that contains all countries, and several single-country test DBs that have exactly the same schema but hold data for only one country.

My query's start node is identified by matching on a given property value, e.g.

match (country:Country{name:"UK"})

and then proceeds to the main query, which is anchored on the variable country. I therefore expect the query times to be similar, given that both runs start from the same known node and should traverse the same number of related nodes in both DBs.

But I am getting very different performance depending on whether I run the query in the full DB or in a single-country DB.

I immediately thought that I must have some kind of Cartesian product issue going on, so I profiled the query in the full DB and in a single-country DB, but the profile is exactly the same for each step in the plan. I was expecting the profile to reveal a marked increase in db hits at some point in the plan, but the values are the same. Am I mistaken about what PROFILE is displaying?
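
For reference, this is roughly how the profiling was run. It is a simplified sketch rather than the real query: the Supplier label and the SUPPLIES relationship type are hypothetical stand-ins for the actual schema.

```cypher
// Hypothetical stand-in for the real query: Supplier and SUPPLIES are assumed names.
// PROFILE executes the query and reports rows and db hits per operator in the plan.
PROFILE
MATCH (country:Country {name: "UK"})
MATCH (country)<-[:SUPPLIES]-(supplier:Supplier)
RETURN count(supplier) AS suppliers
```

The same statement was run unchanged against both DBs so that the per-operator figures could be compared side by side.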

Some sizing: the full DB has 70k nodes and the test DB 672 nodes. The query takes 218764 ms to complete in the full DB versus circa 3407 ms in the test DB.

While writing this I realised that the number of outgoing relationships on certain nodes will be higher in the full DB (suppliers can supply several countries), which I think is probably the cause. But the question remains as to why I am not seeing any indication of this in the profiling.
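
One way to check that suspicion would be to compare relationship counts per node in the two DBs. The sketch below again assumes hypothetical Supplier and SUPPLIES names and a name property on suppliers; it lists the suppliers with the most outgoing relationships to countries.

```cypher
// Assumed schema: (:Supplier)-[:SUPPLIES]->(:Country), with a name property on Supplier.
// Lists the ten suppliers with the most outgoing SUPPLIES relationships.
MATCH (supplier:Supplier)-[r:SUPPLIES]->(:Country)
RETURN supplier.name AS supplier, count(r) AS countriesSupplied
ORDER BY countriesSupplied DESC
LIMIT 10
```

If the counts in the full DB are much higher than in the single-country DB, the traversal fans out further there even though the start node is the same.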

Any thoughts welcome.
