Reputation: 4016
I have a large number of nodes that have outgoing relationships to an even larger number of nodes. I want to query for a limited number of starting nodes and return their related nodes along with them, but the number of related nodes should also be limited.
Is this possible in neo4j 1.9?
For example, create these nodes with an auto index on name:
CREATE p = (bar{company:'Bar1'})<-[:FREQUENTS]-(andres {name:'Andres'})-[:WORKS_AT]->(neo{company:'Neo1'})
WITH andres
CREATE (restaurant{company:'Restaurant1'})<-[:FREQUENTS]-(andres)-[:WORKS_AT]->(lib{company:'Library'}) ;
CREATE p = (bar{company:'Bar2'})<-[:FREQUENTS]-(todd {name:'Todd'})-[:WORKS_AT]->(neo{company:'Neo2'})
WITH todd
CREATE (restaurant{company:'Restaurant2'})<-[:FREQUENTS]-(todd)-[:WORKS_AT]->(lib{company:'Library2'}) ;
CREATE p = (bar{company:'Bar3'})<-[:FREQUENTS]-(hank {name:'Hank'})-[:WORKS_AT]->(neo{company:'Neo3'})
WITH hank
CREATE (restaurant{company:'Restaurant3'})<-[:FREQUENTS]-(hank)-[:WORKS_AT]->(lib{company:'Library3'}) ;
What I would like is something like:
START p=node:node_auto_index('*:*')
MATCH p-[:WORKS_AT]-> c, p-[:FREQUENTS]-> f
RETURN p, collect(distinct c.company), collect(distinct f.company) LIMIT 2;
I want this to return 2 rows with each collection limited to one element, but without applying a function to the collections afterwards. I tried that on a large data set and it became extremely slow. So I need some way to LIMIT the matches themselves.
If this is not possible in neo4j 1.9, would there be a solution in neo4j 2.0?
Upvotes: 0
Views: 816
Reputation: 41706
Can you try something like this:
START p=node:node_auto_index('*:*')
RETURN p,
head(extract(path in p-[:WORKS_AT]->() : head(tail(nodes(path))))) as work_company,
head(extract(path in p-[:FREQUENTS]->() : head(tail(nodes(path))))) as visit_company
The head function on the extracted path nodes should be lazy, so it pulls only the first one from the pattern match.
If you look at the profiling output, you should see that it touches only the first node in each case.
Upvotes: 1
Reputation: 2663
It could be that the '*:*' query triggers some very large operations in the indexing layer rather than being lazy. I would try something like this:
START p=node:node_auto_index('*:*')
WITH p LIMIT 2
MATCH p-[:WORKS_AT]-> c, p-[:FREQUENTS]-> f
RETURN p, collect(distinct c.company), collect(distinct f.company)
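On the 2.0 part of the question, here is a rough sketch of the same idea, assuming the legacy auto index is still available and using the collection slicing that 2.0 adds. The aliases work_companies and visit_companies are just illustrative names, and I have not profiled this on a large data set, so treat it as a starting point rather than a verified fix:

```cypher
// Limit the starting nodes first, so the MATCH only expands two of them.
START p=node:node_auto_index('*:*')
WITH p LIMIT 2
MATCH (p)-[:WORKS_AT]->(c), (p)-[:FREQUENTS]->(f)
// [0..1] keeps only the first element of each collection.
RETURN p,
       collect(DISTINCT c.company)[0..1] AS work_companies,
       collect(DISTINCT f.company)[0..1] AS visit_companies;
```

The slice is still applied after collect, so each collection is built in full for the two remaining start nodes. The saving comes from applying LIMIT before the MATCH, not from the slice itself.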
Upvotes: 0