Wouter

Reputation: 4016

Limiting the number of match collection elements in neo4j using cypher

I have a large number of nodes that have outgoing relationships to an even larger number of nodes. I want to query for a limited number of starting nodes and return their related nodes along with them, but the number of related nodes per starting node should also be limited.

Is this possible in neo4j 1.9?

For example, create these nodes with an auto index on name:

CREATE p = (bar{company:'Bar1'})<-[:FREQUENTS]-(andres {name:'Andres'})-[:WORKS_AT]->(neo{company:'Neo1'}) 
WITH andres 
CREATE (restaurant{company:'Restaurant1'})<-[:FREQUENTS]-(andres)-[:WORKS_AT]->(lib{company:'Library'}) ;

CREATE p = (bar{company:'Bar2'})<-[:FREQUENTS]-(todd {name:'Todd'})-[:WORKS_AT]->(neo{company:'Neo2'}) 
WITH todd 
CREATE (restaurant{company:'Restaurant2'})<-[:FREQUENTS]-(todd)-[:WORKS_AT]->(lib{company:'Library2'}) ;

CREATE p = (bar{company:'Bar3'})<-[:FREQUENTS]-(hank {name:'Hank'})-[:WORKS_AT]->(neo{company:'Neo3'}) 
WITH hank 
CREATE (restaurant{company:'Restaurant3'})<-[:FREQUENTS]-(hank)-[:WORKS_AT]->(lib{company:'Library3'}) ;

What I would like is something like:

START p=node:node_auto_index('*:*') 
MATCH p-[:WORKS_AT]-> c, p-[:FREQUENTS]-> f 
RETURN p, collect(distinct c.company), collect(distinct f.company) LIMIT 2;

This should return 2 rows with each collection limited to one element, but without applying a function to the collected results: I tried that on a large data set and it became extremely slow. So I am looking for some way to LIMIT the matches themselves.

If this is not possible in neo4j 1.9, would there be a solution in neo4j 2.0?

Upvotes: 0

Views: 816

Answers (2)

Michael Hunger

Reputation: 41706

Can you try something like this:

START p=node:node_auto_index('*:*') 
RETURN p, 
     head(extract(path in p-[:WORKS_AT]->() : head(tail(nodes(path))))) as work_company,
     head(extract(path in p-[:FREQUENTS]->() : head(tail(nodes(path))))) as visit_company

The head function over the extracted paths should be evaluated lazily, so it pulls only the first path from each pattern match.

If you look at the profiling output, you should see that it touches only the first node for each pattern.

Upvotes: 1

Jacob Davis-Hansson

Reputation: 2663

It could be that the *:* query triggers some very large operations in the indexing layer, rather than being lazy. I would try something like this:

START p=node:node_auto_index('*:*') 
WITH p LIMIT 2
MATCH p-[:WORKS_AT]-> c, p-[:FREQUENTS]-> f 
RETURN p, collect(distinct c.company), collect(distinct f.company)
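
The two suggestions in this thread can probably be combined: limit the starting nodes first, then take a single related node per pattern with head(extract(...)) instead of collecting everything. The following is an untested sketch for neo4j 1.9 that reuses only the syntax already shown in this thread. Whether the pattern expressions are really evaluated lazily on a large data set is an assumption that should be checked against the profiling output.

```cypher
// Untested sketch: WITH p LIMIT 2 cuts down the starting nodes before any
// relationships are expanded; head(extract(...)) then keeps only the first
// related node per pattern for each remaining p.
START p=node:node_auto_index('*:*')
WITH p LIMIT 2
RETURN p,
     head(extract(path in p-[:WORKS_AT]->() : head(tail(nodes(path))))) as work_company,
     head(extract(path in p-[:FREQUENTS]->() : head(tail(nodes(path))))) as visit_company
```

Note that this returns the related nodes themselves rather than their company property, and one node per pattern rather than a collection.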

Upvotes: 0
