Reputation: 5276
I am using the neo4j driver for Python. I have a program that dynamically creates around 10-12 queries. The final result from all queries is collected in a list and returned.
Below are 10 such queries:
MATCH (sslc:subSubLocality)-[:CHILD_OF]->(v4)-[:CHILD_OF]->(v3)-[:CHILD_OF]->(v2)-[:CHILD_OF]->(st:state) WHERE (st.name_wr = 'abcState') AND (sslc.name_wr= 'xyzSLC' OR sslc.name_wr= 'abcxyzcolony') RETURN st, sslc, v4, v3, v2
MATCH (slc:subLocality)-[:CHILD_OF]->(v3)-[:CHILD_OF]->(v2)-[:CHILD_OF]->(st:state) WHERE (st.name_wr = 'abcState') AND (slc.name_wr= 'xyzSLC' OR slc.name_wr= 'abcxyzcolony') RETURN st, slc, v3, v2
MATCH (loc:locality)-[:CHILD_OF]->(v2)-[:CHILD_OF]->(st:state) WHERE (st.name_wr = 'abcState') AND (loc.name_wr= 'deltax' OR loc.name_wr= 'xyzSLC' OR loc.name_wr= 'abcxyzcolony') RETURN st, loc, v2
MATCH (ct:city)-[:CHILD_OF]->(st:state) WHERE (st.name_wr = 'abcState') AND (ct.name_wr= 'deltax' OR ct.name_wr= 'abcxyz') RETURN st, ct
MATCH (sslc:subSubLocality)-[:CHILD_OF]->(v3)-[:CHILD_OF]->(v2)-[:CHILD_OF]->(ct:city) WHERE (ct.name_wr = 'deltax' OR ct.name_wr = 'abcxyz') AND (sslc.name_wr= 'xyzSLC' OR sslc.name_wr= 'abcxyzcolony') RETURN ct, sslc, v3, v2
MATCH (slc:subLocality)-[:CHILD_OF]->(v2)-[:CHILD_OF]->(ct:city) WHERE (ct.name_wr = 'deltax' OR ct.name_wr = 'abcxyz') AND (slc.name_wr= 'xyzSLC' OR slc.name_wr= 'abcxyzcolony') RETURN ct, slc, v2
MATCH (loc:locality)-[:CHILD_OF]->(ct:city) WHERE (ct.name_wr = 'deltax' OR ct.name_wr = 'abcxyz') AND (loc.name_wr= 'deltax' OR loc.name_wr= 'xyzSLC' OR loc.name_wr= 'abcxyzcolony') RETURN ct, loc
MATCH (sslc:subSubLocality)-[:CHILD_OF]->(v2)-[:CHILD_OF]->(loc:locality) WHERE (loc.name_wr = 'deltax' OR loc.name_wr = 'xyzSLC' OR loc.name_wr = 'abcxyzcolony') AND (sslc.name_wr= 'xyzSLC' OR sslc.name_wr= 'abcxyzcolony') RETURN loc, sslc, v2
MATCH (slc:subLocality)-[:CHILD_OF]->(loc:locality) WHERE (loc.name_wr = 'deltax' OR loc.name_wr = 'xyzSLC' OR loc.name_wr = 'abcxyzcolony') AND (slc.name_wr= 'xyzSLC' OR slc.name_wr= 'abcxyzcolony') RETURN loc, slc
MATCH (sslc:subSubLocality)-[:CHILD_OF]->(slc:subLocality) WHERE (slc.name_wr = 'xyzSLC' OR slc.name_wr = 'abcxyzcolony') AND (sslc.name_wr= 'xyzSLC' OR sslc.name_wr= 'abcxyzcolony') RETURN slc, sslc
The queries might change based on the input dictionary (as I mentioned, they are created at run-time), but they all share the same structure.
Below is the query plan that I get. It remains the same for all queries and differs only in the values inside.
Below is my code that fires up these requests:
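from neo4j import GraphDatabase

def get_query_response(query_list: list) -> list:
    # uri is assumed to be defined elsewhere in the program
    driver = GraphDatabase.driver(uri, auth=("neo4j", "neo4j"))
    with driver.session() as session:
        with session.begin_transaction() as tx:
            # run every query in the same transaction and flatten all records into one list
            response = [record.values() for query in query_list for record in tx.run(query)]
    return response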
The query_list is a list of str that holds these queries.
The problem is the whole task takes 2 seconds to give a response. Is there any way to optimize the query or make it faster or maybe operate in milliseconds?
To answer a few questions:
• Each query takes 3ms - 10ms to execute when I fire it on neo4j desktop. Is it the driver that's causing the issue?
• The machine is an i7 with 16GB memory and a 1TB SSD.
• I have added indices now and I get a performance bump of 500ms, but it still takes 1.5s. Is there any way I can push it to work in milliseconds?
Upvotes: 0
Views: 168
Reputation: 66967
Add appropriate indexes or uniqueness constraints so that your generated queries do not need to scan for the appropriate nodes to start working.
For example (based on your examples), you could add indexes to:
:subSubLocality(name_wr)
:subLocality(name_wr)
:locality(name_wr)
:city(name_wr)
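As a rough sketch of how the four indexes listed above could be created from Python: the helper below only builds the Cypher statements and passes them to whatever session object you give it. The names `index_statements` and `create_indexes` are made up for illustration, and the statement syntax depends on your Neo4j version (`CREATE INDEX ON :Label(prop)` in 3.x, `CREATE INDEX FOR (n:Label) ON (n.prop)` from 4.0 onward), so check which form your server accepts.

```python
# Labels and property taken from the example queries in the question.
LABELS = ["subSubLocality", "subLocality", "locality", "city"]


def index_statements(labels, prop="name_wr", legacy_syntax=True):
    """Build one CREATE INDEX statement per label (hypothetical helper)."""
    if legacy_syntax:
        # Neo4j 3.x style
        return [f"CREATE INDEX ON :{label}({prop})" for label in labels]
    # Neo4j 4.0+ style
    return [f"CREATE INDEX FOR (n:{label}) ON (n.{prop})" for label in labels]


def create_indexes(session, labels=LABELS, legacy_syntax=True):
    """Run each statement on an already open driver session."""
    for statement in index_statements(labels, legacy_syntax=legacy_syntax):
        # consume() waits for the server to finish this statement
        session.run(statement).consume()
```

With the real driver this would be called once, for example as `create_indexes(driver.session())`, and not on every request. Prefixing one of the generated queries with `PROFILE` afterwards should show whether the plan starts from an index seek on these labels.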
Upvotes: 1
Reputation: 303
I can't say for sure what the cause is, but I have a few questions that should help us get closer to an answer.
• Have you tried benchmarking these queries individually? At first glance, they look like they are simple enough to complete, so I don't think this is the issue but it wouldn't hurt to know if you really need to be optimizing the queries themselves.
• You mentioned it takes "2 seconds". Is that measured from the moment you hit 'enter' to execute your Python script (so things like initiating the connection to the Neo4j instance are included), or do the queries themselves take 2.0 seconds to execute?
• The docs note that prior to v3.2 of Neo4j, the Cypher planner wasn't always making the most efficient choices. If you have an earlier version, the docs mention you should default to the cost-based planner.
• Is this a local Neo4j instance? If it's hosted, what are the hardware specs of the host machine? Might not hurt to bump up the specs if possible.
• If you haven't added any custom indexing on properties and your queries always look the same, I would recommend looking into that option.
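To make the first two questions concrete, here is a minimal timing sketch that measures connection setup and each query separately. `time_phases`, `connect` and `run_query` are hypothetical names introduced for this example. The two callables are placeholders for however you open a session and run a query, so the helper itself does not depend on the driver.

```python
import time


def time_phases(connect, run_query, queries):
    """Time connection setup and each query separately.

    connect   -- callable that opens and returns a session-like object
    run_query -- callable (session, query) -> list of rows
    """
    timings = {}

    start = time.perf_counter()
    session = connect()
    timings["connect"] = time.perf_counter() - start

    per_query = []
    for query in queries:
        start = time.perf_counter()
        rows = run_query(session, query)
        # record the query text, the number of rows and the elapsed seconds
        per_query.append((query, len(rows), time.perf_counter() - start))

    timings["queries"] = per_query
    timings["query_total"] = sum(elapsed for _, _, elapsed in per_query)
    return timings
```

With the Neo4j Python driver, `connect` could be something like `lambda: GraphDatabase.driver(uri, auth=(user, password)).session()` and `run_query` something like `lambda s, q: list(s.run(q))`. Depending on the driver version the connection may be opened lazily, in which case the first query absorbs that cost. If setup or the first query dominates the 2 seconds, creating the driver once and reusing it across calls is probably the first thing to try.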
Upvotes: 0