Reputation: 317
I need to query multiple paths (to draw in the UI) from different start nodes in Neo4j (could be hundreds of paths), I'm using UNION
to combine the queries but I suspect that it's not the most efficient way to retrieve the data (I'm also using indexes on the start nodes).
When creating large queries, all kinds of nasty stuff happen.
The query looks something like this:
MATCH path = ((n:actor)-[*0..2]->(g)-[p:was_filmed_at]->(k:movie_set)-[l:located]->(i:state))
WHERE n.name=’john doe1' AND (g:acted_in OR g:produced)
AND p.addDate >= '2017-12-10T00:00:00+00:00'
AND p.addDate <= '2017-12-12T08:08:06+00:00'
AND l.addDate >= '2017-12-10T00:00:00+00:00'
AND l.addDate <= '2017-12-12T08:08:06+00:00'
RETURN nodes(path), relationships(path)
LIMIT 1000
UNION
MATCH path = ((n:actor)-[*0..2]->(g)-[p:was_filmed_at]->(k:movie_set)-[l:located]->(i:state))
WHERE n.name=’john doe2' AND (g:acted_in OR g:produced)
AND p.addDate >= '2017-12-10T00:00:00+00:00'
AND p.addDate <= '2017-12-12T08:08:06+00:00'
AND l.addDate >= '2017-12-10T00:00:00+00:00'
AND l.addDate <= '2017-12-12T08:08:06+00:00'
RETURN nodes(path), relationships(path)
LIMIT 1000
UNION
MATCH path = ((n:actor)-[*0..2]->(g)-[p:was_filmed_at]->(k:movie_set)-[l:located]->(i:state))
WHERE n.name=’john doe3' AND (g:acted_in OR g:produced)
AND p.addDate >= '2017-12-10T00:00:00+00:00'
AND p.addDate <= '2017-12-12T08:08:06+00:00'
AND l.addDate >= '2017-12-10T00:00:00+00:00'
AND l.addDate <= '2017-12-12T08:08:06+00:00'
RETURN nodes(path), relationships(path)
LIMIT 1000
This is a query with only 3 paths, and it does not scale well with hundreds. Is there a simple, more efficient way to do this?
Upvotes: 1
Views: 138
Reputation: 16365
Since you are changing only the n.name
comparison, you can rewrite your query using IN operator and pass the list of names as an array. This way:
MATCH path = ((n:actor)-[*0..2]->(g)-[p:was_filmed_at]->(k:movie_set)-[l:located]->(i:state))
WHERE n.name IN ['john doe1', 'john doe2', 'john doe3']
AND (g:acted_in OR g:produced)
AND '2017-12-10T00:00:00+00:00' <= p.addDate <= '2017-12-12T08:08:06+00:00'
AND '2017-12-10T00:00:00+00:00' <= l.addDate <= '2017-12-12T08:08:06+00:00'
RETURN nodes(path), relationships(path)
LIMIT 1000
Also you can chain inequalities like: AND '2017-12-10T00:00:00+00:00' <= p.addDate <= '2017-12-12T08:08:06+00:00'
. Make sure you have an index on :actor(name)
for quick index lookup of your starting node. (Thanks to @InverseFalcon)
Upvotes: 1