Martin Preusse

Reputation: 9369

neo4j 2.0/Cypher: match nodes connected by 2 different relations

I have a multigraph with multiple relationships between nodes. I am trying to write a Cypher query that returns nodes connected by two relationships with different properties:

Nodes with the label Mirna are connected to Gene nodes by the REGULATES relationship. I'd like to return all Mirna and Gene nodes that are connected by two REGULATES relationships, one with the source property first_db and one with second_db.

Graph schema

Here is what I tried: http://gist.neo4j.org/?4fddc897b30ef7aa4732

This works but it's very slow for large data sets. I guess because I match too much in the beginning:

MATCH (m:Mirna)-[r:REGULATES]->(g:Gene)
WITH m,g, collect(r.source) AS source    
WHERE 'first_db' IN source AND 'second_db' IN source
RETURN m,g

This executes faster and gives the same results for toy data:

MATCH (m:Mirna)-[r:REGULATES { source: 'first_db' }]->(g:Gene),
      (m:Mirna)-[r2:REGULATES { source: 'second_db' }]->(g:Gene)
RETURN m,g,r,r2

But is this safe and does Cypher always understand that I want two relationships between the same nodes? Is there another more efficient/elegant way to query for multiple relationships?

Upvotes: 3

Views: 1934

Answers (1)

Michael Hunger

Reputation: 41706

Your first query filters much too late, so the filter cannot be applied during pattern matching; that's why it's slower (besides being a global graph query). Filter on the relationship property before aggregating:

MATCH (m:Mirna)-[r:REGULATES]->(g:Gene)
WHERE r.source = 'first_db' OR r.source = 'second_db'
WITH m,g, collect(r.source) AS source    
WHERE 'first_db' IN source AND 'second_db' IN source
RETURN m,g

If there are no false positives you can also simplify it to this:

MATCH (m:Mirna)-[r:REGULATES]->(g:Gene)
WHERE r.source = 'first_db' OR r.source = 'second_db'
WITH m,g, count(DISTINCT r.source) AS source
WHERE source = 2
RETURN m,g
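
On the "is this safe" part of the question: an identifier that appears more than once in a query refers to the same node, so your second query does constrain both relationships to the same m and g. And because r and r2 require different source values, they cannot be the same relationship. A minimal sketch of the same idea with the labels stated only once (r1 is just r renamed; this has not been benchmarked against the queries above):

```cypher
// Match pairs linked via first_db, then require a second_db
// relationship between the same, already bound, m and g.
MATCH (m:Mirna)-[r1:REGULATES {source: 'first_db'}]->(g:Gene)
MATCH (m)-[r2:REGULATES {source: 'second_db'}]->(g)
RETURN m, g, r1, r2
```

If a pair has several REGULATES relationships with the same source, this returns one row per (r1, r2) combination; use RETURN DISTINCT m, g if you only need the pairs.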

Upvotes: 3
