Chris

Reputation: 659

How can I tune a template for a cypher query?

I'm writing a template for a query that returns a list of test scores with relevant information. Against a sample dataset on Neo4j Community, queries built from it are taking a long time.

Here's an example:

// Marks that were ranked top 10 on that test, and performed during a section between 2015-1-1 and 2016-02-07

MATCH (mark:Mark)-[r1:PERFORMED_BY]->(prsn:Person)
MATCH (mark:Mark)-[r2:PERFORMED_ON]->(test:Test)
MATCH (mark:Mark)-[r3:PERFORMED_FOR]->(course:Course)
MATCH (mark:Mark)-[r4:PERFORMED_DURING]->(sect:Section)
MATCH (s:Section)-[r5:LOCATED_IN]->(room:Room)

WHERE r2.rank in range(1,10) AND sect.datetime in range(1420099200000,1494831600000,100000)

RETURN mark.uid, prsn.uid, test.uid, course.uid, sect.uid, mark.score, course.datetime, prsn.name, course.title, room.number,
r1.class, r2.rank, r3.rank

ORDER BY mark.score

Even the simplest query, WHERE r2.rank = 1, can take a few seconds. With the range operator it takes 30+ seconds. Are there any strategies I can use to tune the query?

Neo4j Community 3.1.1


Upvotes: 0

Views: 167

Answers (1)

InverseFalcon

Reputation: 30407

It helps to match on the most relevant data first, since smaller datasets will be easier and faster to filter with subsequent MATCH operations. Once you've filtered down to the relevant nodes, THEN match on the rest of the nodes you'll need for your return.

Also, you'll want to make sure you have an index on :Section(datetime) for fast lookups.
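As a minimal sketch, assuming the Neo4j 3.x schema syntax (which matches the 3.1.1 version in the question), the index could be created like this. Newer Neo4j versions use a different CREATE INDEX form, so check the documentation for your version.

```cypher
// Neo4j 3.x syntax: index the datetime property on :Section nodes
// so range predicates on sect.datetime can use an index seek
CREATE INDEX ON :Section(datetime)
```

The index is populated in the background. Running CALL db.indexes() should list it along with its state, so you can confirm it is online before re-testing the query.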

Try this one:

MATCH (mark:Mark)-[r4:PERFORMED_DURING]->(sect:Section)
// faster to do an indexed range query like this
WHERE 1420099200000 <= sect.datetime <= 1494831600000
MATCH (mark)-[r2:PERFORMED_ON]->(test:Test)
WHERE 1 <= r2.rank <= 10
// now you have all relevant marks, match on the rest of the nodes you need
MATCH (mark)-[r1:PERFORMED_BY]->(prsn:Person)
MATCH (mark)-[r3:PERFORMED_FOR]->(course:Course)
MATCH (sect)-[r5:LOCATED_IN]->(room:Room)

RETURN mark.uid, prsn.uid, test.uid, course.uid, sect.uid, mark.score, course.datetime, prsn.name, course.title, room.number,
r1.class, r2.rank, r3.rank

ORDER BY mark.score

Also, it's always a good idea to PROFILE your query when tuning to figure out the problem areas.
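For illustration, profiling just means prefixing the query with the PROFILE keyword. The sketch below profiles only the first, most selective part of the query; the RETURN count(mark) is a stand-in I've added so the snippet is complete on its own, not part of the original query.

```cypher
// PROFILE executes the query and reports rows and db hits for each operator
PROFILE
MATCH (mark:Mark)-[r4:PERFORMED_DURING]->(sect:Section)
WHERE 1420099200000 <= sect.datetime <= 1494831600000
RETURN count(mark)
```

In the resulting plan, an operator such as NodeIndexSeekByRange on :Section(datetime) suggests the index is being used, while a NodeByLabelScan or a CartesianProduct operator usually points at the expensive part. EXPLAIN shows the plan without running the query, which is handy when the query itself is too slow to wait for.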

Oh, and another reason this was blowing up: you had matched a :Section as sect, but the following MATCH used a new variable s instead of sect. That pattern was finding all sections s in all rooms, which wasn't relevant to the rest of your query, and every one of those rows was combined with every row from the earlier matches.

Upvotes: 2
