Reputation: 285
I have a performance-critical application that has to match multiple nodes to another node based on regex matching. My current query is as follows:
MATCH (person: Person {name: 'Mark'})
WITH person
UNWIND person.match_list AS match
MATCH (pet: Animal)
WHERE pet.name_regex =~ match
MERGE (person)-[:OWNS_PET]->(pet)
RETURN pet
However, this query runs VERY slowly (around 500 ms on my workstation). The graph contains around 500K nodes, and around 10K of them will match the regex.
I'm wondering whether there is a more efficient way to rewrite this query so that it produces the same results but runs faster.
EDIT:
When I run this query for several Persons from multiple threads, I get a TransientError exception:
neo4j.exceptions.TransientError: ForsetiClient[3] can't acquire ExclusiveLock{owner=ForsetiClient[14]} on NODE(1889), because holders of that lock are waiting for ForsetiClient[3].
EDIT 2:
Person:name is unique and indexed.
Animal:name_regex is not indexed.
Upvotes: 0
Views: 206
Reputation: 8833
First, I would simplify your query as much as possible; the way you are doing it now does a lot of wasted work after a match has already been found:
MATCH (person: Person {name: 'Mark'}), (pet: Animal)
WHERE ANY(match in person.match_list WHERE pet.name_regex =~ match)
MERGE (person)-[:OWNS_PET]->(pet)
RETURN pet
This way, only one MERGE is attempted per pet even if several patterns match it, and once one pattern matches, the remaining patterns are not evaluated against the same pet. It also gives Cypher the freedom to optimize the query as well as it can for your data.
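If you want to see where the time goes, you can prefix the query with PROFILE to inspect the execution plan; it should show an index lookup for the Person followed by a full scan over Animal, which is where the regex cost sits (plan details vary by Neo4j version, and PROFILE actually runs the query, including the MERGE, whereas EXPLAIN only shows the estimated plan):

PROFILE
MATCH (person: Person {name: 'Mark'}), (pet: Animal)
WHERE ANY(match in person.match_list WHERE pet.name_regex =~ match)
MERGE (person)-[:OWNS_PET]->(pet)
RETURN pet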
To improve the Cypher further, you will need to optimize your data model. Regex matching is expensive (it requires scanning every Animal node and running a string match on each one), so if the match patterns are largely shared between people, it would be better to break them out into their own nodes and connect people to them, so that the work of one regex match can be reused everywhere it is repeated.
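A minimal sketch of that refactor, using hypothetical :Pattern nodes and :USES_PATTERN / :PATTERN_MATCHES relationship types (these names are illustrative, not part of your existing model), could look like this, run as three separate statements:

// 1. Store each distinct regex once and link the people that use it.
MATCH (person: Person)
UNWIND person.match_list AS regex
MERGE (pat: Pattern {regex: regex})
MERGE (person)-[:USES_PATTERN]->(pat)

// 2. Evaluate every regex against the animals once, caching the results.
MATCH (pat: Pattern), (pet: Animal)
WHERE pet.name_regex =~ pat.regex
MERGE (pat)-[:PATTERN_MATCHES]->(pet)

// 3. Linking owners to pets is now a plain traversal with no regex work.
MATCH (person: Person {name: 'Mark'})-[:USES_PATTERN]->(:Pattern)-[:PATTERN_MATCHES]->(pet: Animal)
MERGE (person)-[:OWNS_PET]->(pet)
RETURN DISTINCT pet

Whether the caching step pays off depends on how often patterns are shared and how often the Animal data changes; step 2 would need to be re-run (or scoped to new nodes) whenever animals are added.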
Upvotes: 2