Reputation: 9369
I have a multigraph with multiple relationships between nodes. I am trying to write a Cypher query that returns nodes connected by two relationships with different properties: the node with label Mirna is connected to Gene with the REGULATES relationship. I'd like to return all Mirna and Gene nodes that are connected by two REGULATES relationships with the source properties first_db and second_db.
Here is what I tried: http://gist.neo4j.org/?4fddc897b30ef7aa4732
This works, but it's very slow for large data sets, I guess because I match too much at the beginning:
MATCH (m:Mirna)-[r:REGULATES]->(g:Gene)
WITH m,g, collect(r.source) AS source
WHERE 'first_db' IN source AND 'second_db' IN source
RETURN m,g
This executes faster and gives the same results for toy data:
MATCH (m:Mirna)-[r:REGULATES { source: 'first_db' }]->(g:Gene),
(m:Mirna)-[r2:REGULATES { source: 'second_db' }]->(g:Gene)
RETURN m,g,r,r2
But is this safe, and does Cypher always understand that I want two relationships between the same nodes? Is there a more efficient or elegant way to query for multiple relationships?
Upvotes: 3
Views: 1934
Reputation: 41706
Your first query applies the filter much too late, so it cannot be used during pattern matching; that's why it's slower (besides being a global graph query). Filter on r.source before aggregating:
MATCH (m:Mirna)-[r:REGULATES]->(g:Gene)
WHERE r.source = 'first_db' OR r.source = 'second_db'
WITH m,g, collect(r.source) AS source
WHERE 'first_db' IN source AND 'second_db' IN source
RETURN m,g
If there are no false positives you can also simplify it to this:
MATCH (m:Mirna)-[r:REGULATES]->(g:Gene)
WHERE r.source = 'first_db' OR r.source = 'second_db'
WITH m,g, count(DISTINCT r.source) AS source
WHERE source = 2
RETURN m,g
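To see why the two filtered queries above should agree, here is a minimal Python sketch that mimics their grouping logic on a made-up edge list. It does not run Cypher or talk to Neo4j; the node names, the EDGES data, and both function names are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical toy multigraph: one (mirna, gene, source) tuple per REGULATES relationship.
EDGES = [
    ("mir-1", "geneA", "first_db"),
    ("mir-1", "geneA", "second_db"),
    ("mir-1", "geneB", "first_db"),
    ("mir-2", "geneA", "second_db"),
    ("mir-2", "geneB", "first_db"),
    ("mir-2", "geneB", "first_db"),   # duplicate source, must not count twice
    ("mir-3", "geneC", "first_db"),
    ("mir-3", "geneC", "second_db"),
    ("mir-3", "geneC", "third_db"),   # unrelated source, dropped by the early filter
]

WANTED = {"first_db", "second_db"}


def pairs_by_collect(edges):
    """Mimic: filter on source, collect(r.source), then check both values are present."""
    sources = defaultdict(list)
    for mirna, gene, source in edges:
        if source in WANTED:
            sources[(mirna, gene)].append(source)
    return {pair for pair, found in sources.items()
            if "first_db" in found and "second_db" in found}


def pairs_by_distinct_count(edges):
    """Mimic: filter on source, then require count(DISTINCT r.source) = 2."""
    sources = defaultdict(set)
    for mirna, gene, source in edges:
        if source in WANTED:
            sources[(mirna, gene)].add(source)
    return {pair for pair, found in sources.items() if len(found) == len(WANTED)}


if __name__ == "__main__":
    print(sorted(pairs_by_collect(EDGES)))
    print(sorted(pairs_by_distinct_count(EDGES)))
```

On this toy data both functions return the same two pairs, (mir-1, geneA) and (mir-3, geneC). The pair (mir-2, geneB) is excluded because its two relationships share one source, which is the case DISTINCT guards against.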
Upvotes: 3