Reputation: 131
I want to know how a path query is executed in Neo4j. For example, I have the path query shown below:
match p=(n)-[r*1..10]->(m)
where
(
n.URI='http://yago-knowledge.org/resource/Jacob_T._Schwartz' OR
n.URI='http://yago-knowledge.org/resource/Anna_Karina'
) AND
filter(x IN r where type(x)=~'.*hasAcademicAdvisor.*') AND
filter(y IN r where type(y)=~'.*isCitizenOf.*') AND
filter(z IN r where type(z)=~'.*participatedIn.*') AND
filter(u IN r where type(u)=~'.*happendedIn.*') AND
filter(v IN r where type(v)=~'.*dealsWith.*')
return p, length(p) order by length(p) desc;
This query finds paths in a graph database whose source node is "http://yago-knowledge.org/resource/Jacob_T._Schwartz" or "http://yago-knowledge.org/resource/Anna_Karina" and which contain certain relationships.
I ran this query with the PROFILE command, and below is the execution plan I got.
Note that the content in row 5, column 5 is too long, so I replaced it with *** instead of the actual content.
Actually, *** denotes ((((((Property(n,URI(0)) == { AUTOSTRING0} OR Property(n,URI(0)) == { AUTOSTRING1}) AND nonEmpty(FilterFunction(r,x,RelationshipTypeFunction(x) ~= /{ AUTOSTRING2}/))) AND nonEmpty(FilterFunction(r,y,RelationshipTypeFunction(y) ~= /{ AUTOSTRING3}/))) AND nonEmpty(FilterFunction(r,z,RelationshipTypeFunction(z) ~= /{ AUTOSTRING4}/))) AND nonEmpty(FilterFunction(r,u,RelationshipTypeFunction(u) ~= /{ AUTOSTRING5}/))) AND nonEmpty(FilterFunction(r,v,RelationshipTypeFunction(v) ~= /{ AUTOSTRING6}/)))
Sorry for the bad formatting.
Can anybody help me understand the plan? Thanks in advance!
Upvotes: 1
Views: 127
Reputation: 41706
Try this for your query (after fixing the URIs and relationship types):
match p=(n:Resource)-[:hasAcademicAdvisor|:isCitizenOf|:participatedIn|:happendedIn|:dealsWith*1..10]->(m:Resource)
where n.URI IN ['Jacob_T._Schwartz', 'Anna_Karina']
return p, length(p)
order by length(p) desc;
For query plan details, see the extensive docs in the Neo4j manual.
Also consider the Neo4j browser for a more visual query plan.
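To compare plans while experimenting, you can prefix the query with EXPLAIN or PROFILE. The sketch below reuses the query from this answer unchanged and assumes a Neo4j version that supports EXPLAIN; the `Resource` label and the shortened URIs are placeholders that you would adapt to your data.

```cypher
// EXPLAIN only plans the query and shows the plan without running it.
// PROFILE runs the query and also reports rows and db hits per operator.
EXPLAIN
MATCH p=(n:Resource)-[:hasAcademicAdvisor|:isCitizenOf|:participatedIn|:happendedIn|:dealsWith*1..10]->(m:Resource)
WHERE n.URI IN ['Jacob_T._Schwartz', 'Anna_Karina']  // placeholder URIs
RETURN p, length(p)
ORDER BY length(p) DESC;
```

Swapping EXPLAIN for PROFILE on the same query gives the actual counts, which is what you want when checking whether a rewrite reduced the work.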
Upvotes: 0
Reputation: 10856
I believe that when you look at the plan in the text console, you read it from bottom to top. Thus you can see that it first finds every single path of length 1-10 between n and every other node m (which is obviously a lot). After it has found all of those paths, it takes them and chooses which ones to keep with the filter.
If you can, you should try putting relationship conditions on the path match rather than in a filter so that Neo4j can filter as it's traversing and stop traversing if it doesn't match, which can save you a lot of DB accesses. I notice that your relationship type matching is pretty loose, so I don't know how hard that would be, but if you know all of the possible relationship types then you should specify them like this:
match p=(n)-[r:hasAcademicAdvisor|isCitizenOf|participatedIn|happendedIn|dealsWith*1..10]->(m)
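Note that listing the types in the pattern only restricts which relationships may be traversed; it does not by itself require each type to appear, which is what your five separate filters do. A minimal sketch of a full query that keeps that requirement is shown here. It assumes the relationship types are exactly these five names with no prefix or suffix, which your regexes do not guarantee, so treat the type names as placeholders.

```cypher
// Sketch only: the five type names are assumed to be exact matches.
PROFILE
MATCH p = (n)-[:hasAcademicAdvisor|isCitizenOf|participatedIn|happendedIn|dealsWith*1..10]->(m)
WHERE n.URI IN ['http://yago-knowledge.org/resource/Jacob_T._Schwartz',
                'http://yago-knowledge.org/resource/Anna_Karina']
  // Keep the original condition that every type occurs at least once on the path.
  AND any(x IN relationships(p) WHERE type(x) = 'hasAcademicAdvisor')
  AND any(x IN relationships(p) WHERE type(x) = 'isCitizenOf')
  AND any(x IN relationships(p) WHERE type(x) = 'participatedIn')
  AND any(x IN relationships(p) WHERE type(x) = 'happendedIn')
  AND any(x IN relationships(p) WHERE type(x) = 'dealsWith')
RETURN p, length(p)
ORDER BY length(p) DESC;
```

The traversal can now stop as soon as it meets a relationship of another type, and the any() checks run only on the paths that survive.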
I'd be curious for more info on why you're using a regex to match the relationship types. I know sometimes people put dynamic values in their relationship types (like an ID), and that's generally a code smell, in my experience. Often you can use a relationship property instead.
A couple of other notes: You're not using labels, so in order to find nodes with the URI property that you've specified, it will first need to search the entire database. If you use a label and create an index or a constraint on :Label(URI), then that part will be much faster.
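As a rough sketch of that setup: the label name `Resource` is only an assumption (borrowed from the other answer in this thread), and the index statement uses the older syntax, so check which form your Neo4j version expects.

```cypher
// 1. Label every node that carries a URI property.
//    "Resource" is a hypothetical label name; on a large graph, run this in batches.
MATCH (n)
WHERE n.URI IS NOT NULL
SET n:Resource;

// 2. Index the property (older syntax). Recent Neo4j releases use:
//    CREATE INDEX FOR (n:Resource) ON (n.URI)
CREATE INDEX ON :Resource(URI);

// 3. Anchor the lookup on the label so the planner can use the index
//    instead of scanning all nodes.
MATCH (n:Resource)
WHERE n.URI = 'http://yago-knowledge.org/resource/Jacob_T._Schwartz'
RETURN n;
```

If URIs are unique per node, a uniqueness constraint on the same label and property would serve the lookup as well, since it is backed by an index.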
I would also use IN to match the URI in this case:
WHERE n.URI IN ['http://yago-knowledge.org/resource/Jacob_T._Schwartz', 'http://yago-knowledge.org/resource/Anna_Karina']
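Lastly, you should also be able to take advantage of regular expressions to make this simpler (and perhaps more efficient). Be aware that a single combined pattern only requires at least one relationship of any of the listed types, whereas your five separate filters each require their own type to appear: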
filter(x IN r where type(x)=~'.*(hasAcademicAdvisor|isCitizenOf|participatedIn|happendedIn|dealsWith).*') AND
But again, that's not as efficient as putting it into your MATCH, if that's possible.
Upvotes: 1