Reputation: 377

Neo4j not using indices

Currently I am trying this query on Neo4j 2.2.2

At the time of this post, we have not labeled any nodes yet as we recently upgraded from Neo4j 1.x. so we don't have the option of making use of the USING clause.

I am trying to uses indices but ending up with full table scans.

START pfComp=node:Company('id:2403226') , ptComp=node:Company('id:1946633')
OPTIONAL MATCH
     (pfComp)<-[c:CHILD_OF*]-(cfComp)
WITH collect(id(cfComp)) as cfCompIds, ptComp, pfComp
OPTIONAL MATCH
                (ptComp)<-[c2:CHILD_OF*]-(ctComp)
WITH  cfCompIds, collect(id(ctComp)) AS ctCompIds
MATCH
                (fComp) -[fR:PARTICIPATES_IN]->  cdeals <-[tR:PARTICIPATES_IN]-(tComp)
WHERE
                (fComp.id = 2403226 or id(fComp) in  cfCompIds) and
                (tComp.id = 1946633 or id(tComp) in ctCompIds)
RETURN fComp, tComp, cdeals

Cypher version: CYPHER 2.2, planner: COST. 1305292 total db hits in 79128 ms.

Any help on this will be much appreciated.

Below is the full profile command output.

This is the explain plan for the query

The beginning portion of the query performs fast:

profile START pfComp=node:Company('id:2403226') , ptComp=node:Company('id:1946633')
OPTIONAL MATCH
     (pfComp)<-[c:CHILD_OF*]-(cfComp)
WITH collect(id(cfComp)) as cfCompIds, ptComp, pfComp
OPTIONAL MATCH
                (ptComp)<-[c2:CHILD_OF*]-(ctComp)
return   cfCompIds, collect(id(ctComp)) AS ctCompIds

Cypher version: CYPHER 2.2, planner: COST. 836 total db hits in 582 ms.

enter image description here

Upvotes: 3

Answers (3)

Michael Hunger

Reputation: 41706

Your second part looks like a relational join, or an additional lookup (like an n+1 select). Perhaps use the graph model instead? And the query gets simpler too.

So you would compute fComp and tComp with the initial matches, due to the *0.. it includes pfComp and ptComp each.

Then you have the cross over between fComp and tComp for the last match.

Please try it out and see how it fares:

MATCH (pfComp:lCompany)<-[c:CHILD_OF*0..]-(fComp:lCompany)
WHERE pfComp.id = 2403226
// reduce cardinality for following match
WITH collect(distinct fComp) as companies1
MATCH (ptComp:lCompany)<-[c2:CHILD_OF*]-(tComp:lCompany)
WHERE ptComp.id = 1946633
// create cross product between fComp and tComp
UNWIND companies1 as fComp
MATCH (fComp) -[fR:PARTICIPATES_IN]->(cdeals)<-[tR:PARTICIPATES_IN]-(tComp)
RETURN  fComp, tComp, cdeals;

Upvotes: 2

Ted Gulesserian

Reputation: 377

Our solution was to create labels (lCompany) and add the newer index type on the Company.id column (CREATE INDEX ON :lCompany(id)).

The query was then tweaked to use the new index:

OPTIONAL MATCH
     (pfComp:lCompany)<-[c:CHILD_OF*]-(cfComp:lCompany)
WHERE pfComp.id = 2403226
WITH 
    collect(cfComp.id) as cfCompIds
    , pfComp
OPTIONAL MATCH
        (ptComp:lCompany)<-[c2:CHILD_OF*]-(ctComp:lCompany)
WHERE ptComp.id = 1946633
WITH  cfCompIds, 
      collect(ctComp.id)  AS ctCompIds, 
      pfComp, ptComp 
MATCH
      (fComp:lCompany) -[fR:PARTICIPATES_IN]->  cdeals <-[tR:PARTICIPATES_IN]-(tComp:lCompany)
USING INDEX fComp:lCompany(id) //tComp:lCompany(id)
WHERE
                (   
                    fComp.id in (cfCompIds + [2403226])
                ) 
                and
                (
                    tComp.id in (ctCompIds + [1946633])
                )
RETURN  fComp, tComp, cdeals

There may be futher optimizations that can be done, but this is as far as we've come so far.

The profiler results are now:

Cypher version: CYPHER 2.2, planner: COST. 134151 total db hits in 1498 ms.

This is the new profile after the tweak:

enter image description here

Upvotes: 1

Martin Preusse

Reputation: 9369

There are index queries in your profile (NodeByQueryIndex).

You can define which index you want to use in your query:

MATCH (n:Swedish)
USING INDEX n:Swedish(surname)
WHERE n.surname = 'Taylor'
RETURN n

See http://neo4j.com/docs/stable/query-using.html

Upvotes: 1

Neo4j not using indices

Answers (3)

Related Questions