sgao

Reputation: 131

Profiling Neo4j path query

I want to understand how Neo4j executes a path query. For example, I have the path query shown below:

match p=(n)-[r*1..10]->(m)
where
  (
    n.URI='http://yago-knowledge.org/resource/Jacob_T._Schwartz' OR
    n.URI='http://yago-knowledge.org/resource/Anna_Karina'
  ) AND
  filter(x IN r where type(x)=~'.*hasAcademicAdvisor.*') AND
  filter(y IN r where type(y)=~'.*isCitizenOf.*') AND
  filter(z IN r where type(z)=~'.*participatedIn.*') AND
  filter(u IN r where type(u)=~'.*happendedIn.*') AND
  filter(v IN r where type(v)=~'.*dealsWith.*')
return p, length(p) order by length(p) desc;

This query finds paths in a graph database whose source node has the URI "http://yago-knowledge.org/resource/Jacob_T._Schwartz" or "http://yago-knowledge.org/resource/Anna_Karina" and that contain certain relationship types.

I used the PROFILE command with this query and below is the execution plan I got.

[Image: PROFILE execution plan output]

Note that the content in row 5, column 5 is too long, so I replaced it with *** in the plan.

Actually, *** denotes ((((((Property(n,URI(0)) == { AUTOSTRING0} OR Property(n,URI(0)) == { AUTOSTRING1}) AND nonEmpty(FilterFunction(r,x,RelationshipTypeFunction(x) ~= /{ AUTOSTRING2}/))) AND nonEmpty(FilterFunction(r,y,RelationshipTypeFunction(y) ~= /{ AUTOSTRING3}/))) AND nonEmpty(FilterFunction(r,z,RelationshipTypeFunction(z) ~= /{ AUTOSTRING4}/))) AND nonEmpty(FilterFunction(r,u,RelationshipTypeFunction(u) ~= /{ AUTOSTRING5}/))) AND nonEmpty(FilterFunction(r,v,RelationshipTypeFunction(v) ~= /{ AUTOSTRING6}/)))

Sorry for the bad formatting.

Can anybody help me understand the plan? Thanks in advance!

Upvotes: 1

Views: 127

Answers (2)

Michael Hunger

Reputation: 41706

  • Use labels
  • Use Neo4j 2.3.x
  • Use a constraint on :Resource(URI)
  • Consider removing the URI prefix from your URI values (it just wastes space in the DB), or replace it with a namespace on import
  • Don't use regexps for rel-type matching; use sensible rel-types to begin with
  • Remove ambiguity from your relationship types

Try this for your query (after fixing the URIs and rel-types)

match p=(n:Resource)-[:hasAcademicAdvisor|:isCitizenOf|:participatedIn|:happendedIn|:dealsWith*1..10]->(m:Resource)
where n.URI IN ['Jacob_T._Schwartz', 'Anna_Karina']
return p, length(p) 
order by length(p) desc;

For query plan details, see the extensive docs in the Neo4j manual.

Also consider the Neo4j browser for a more visual query plan.

Upvotes: 0

Brian Underwood

Reputation: 10856

I believe that when you look at the plan from the text console, you read it from bottom to top. Thus you can see that it first finds every single path of length 1-10 between n and every other node m (which is obviously a lot). Only after it has found all of those paths does it apply the filter to choose which ones to keep.

If you can, you should try putting relationship conditions on the path match rather than in a filter so that Neo4j can filter as it's traversing and stop traversing if it doesn't match, which can save you a lot of DB accesses. I notice that your relationship type matching is pretty loose, so I don't know how hard that would be, but if you know all of the possible relationship types then you should specify them like this:

match p=(n)-[r:hasAcademicAdvisor|isCitizenOf|participatedIn|happendedIn|dealsWith*1..10]->(m)

I'd be curious for more info on why you're using a regex to match the relationship types. I know sometimes people put dynamic values in their relationship types (like an ID), and that's generally a code smell, in my experience. Often you can use a relationship property instead.

A couple of other notes: You're not using labels, so in order to find nodes with the URI property that you've specified it will first need to search the entire database. If you use a label and create an index or a constraint on :Label(URI) then that part will be much faster.
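
As a rough sketch of that setup, in the legacy schema syntax of Neo4j 2.x/3.x: the :Resource label is borrowed from the other answer and is only an assumption about your model, and newer Neo4j releases use a different syntax for schema commands, so check the manual for your version. Pick either the index or the constraint, not both.

```cypher
// Assumption: every node that carries a URI gets the (hypothetical) :Resource label.
MATCH (n) WHERE n.URI IS NOT NULL SET n:Resource;

// Option 1: a plain index on the lookup property.
CREATE INDEX ON :Resource(URI);

// Option 2 (instead of option 1): if URIs are unique, a uniqueness
// constraint, which is backed by an index as well.
CREATE CONSTRAINT ON (n:Resource) ASSERT n.URI IS UNIQUE;
```

With that in place, starting the pattern at (n:Resource) should let the planner look up the start nodes by URI instead of scanning every node.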

I would also use IN to match the URI in this case:

WHERE n.URI IN ['http://yago-knowledge.org/resource/Jacob_T._Schwartz', 'http://yago-knowledge.org/resource/Anna_Karina']

Lastly, you should also be able to take advantage of regular expressions to make this simpler (and perhaps more efficient):

filter(x IN r where type(x)=~'.*(hasAcademicAdvisor|isCitizenOf|participatedIn|happendedIn|dealsWith).*') AND

But again, that's not as efficient as putting it into your MATCH if that's possible.
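
One caveat, as far as I can tell: your original WHERE clause requires each of the five types to appear at least once in the path, whereas a type list in the MATCH (or a single combined regex) only requires every relationship to be one of those types. If you need both, something like the following sketch might work. It assumes the type names are exact, that paths may contain only these five types, and that your nodes carry a (hypothetical) :Resource label:

```cypher
// Restrict traversal to the five types, then require each type at least once.
MATCH p=(n:Resource)-[r:hasAcademicAdvisor|isCitizenOf|participatedIn|happendedIn|dealsWith*1..10]->(m)
WHERE n.URI IN ['http://yago-knowledge.org/resource/Jacob_T._Schwartz',
                'http://yago-knowledge.org/resource/Anna_Karina']
  AND ANY(x IN r WHERE type(x) = 'hasAcademicAdvisor')
  AND ANY(x IN r WHERE type(x) = 'isCitizenOf')
  AND ANY(x IN r WHERE type(x) = 'participatedIn')
  AND ANY(x IN r WHERE type(x) = 'happendedIn')
  AND ANY(x IN r WHERE type(x) = 'dealsWith')
RETURN p, length(p)
ORDER BY length(p) DESC;
```

The ANY predicates are still evaluated on the expanded paths, but the type list in the pattern keeps the expansion from following relationships that could never be part of a match.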

Upvotes: 1
