Reputation: 767

how to skip a duplicate node when creating nodes in Neo4j

I have a list of email address i use to create nodes. However if one of those email address already exist I don't want to get a CATCH '....node already exist' error since this would kill the entire query and none of the nodes would be created. So I am using a MERGE then ON CREATE to check the existence on the email address before creating the node. Problem is I need to create a relationship with a second node so after the MERGE and ON CREATE I have to use a WITH....CREATE to create the relationship and here lies the problem... the WITH is not under the 'scope' of the ON CREATE so the CREATE (a)--[r]->(b) now tries to create the a node with the email address I skipped in the MERGE above...resulting in CATCH '...already exist...' error a trash my query. Here is my CYPHER:

 commons.session
      .run['[email protected]', '[email protected]',[email protected]'] AS coll
        UNWIND coll AS invitee
        WITH DISTINCT invitee
        MERGE (i {email: invitee})
        ON CREATE
          SET i:Invitee
        WITH i,invitee
        CREATE (s:Person {email: '[email protected]})-[r:INVITED]->(i)
        RETURN i.email AS emails, COUNT(r) AS invitees)

What I am expecting to return is a list of only the email addresses where a node and relationship was created. Basically I need to be in the 'scope' of the ON CREATE since the duplicate would have been skipped. Any help in making this work would be greatly appreciated.

Upvotes: 0

Answers (2)

MichaelE

Reputation: 767

Ok... so the neither FOREACH nor CASE did the trick. Actually the WHERE clause did. My original need as stated in the question was to accept a list of emails, create nodes where a node doesn't exist for an email in the list and return a list of the created emails nodes and a count of how many....AND didn't fail on an 'email already exist...' error CATCH statement.This is the solution I mustered:

MATCH(s:Person {email: '[email protected]'})
WITH s, ['[email protected]','[email protected]','[email protected]','[email protected]'] AS coll
   UNWIND coll AS invitee
   WITH DISTINCT invitee, s
    OPTIONAL MATCH (i {email: invitee})
    WITH s,invitee, i
     WHERE i IS NULL
      CREATE (s)-[r:INVITED]->(n:Invitee {email:invitee})
  RETURN s,n, COUNT(r)

With the WHERE clause I was able to focus on only those emails for which there were no nodes already created.

Thanks to @cybersam for his suggestion of the OPTIONAL MATCH and his iterations for a possible solution.

Upvotes: 0

cybersam

Reputation: 66989

[UPDATE 2]

WITH [
  {email:'[email protected]', name:'tom'},
  {email:'[email protected]', name:'tony'},
  {email:'[email protected]', name:'mike'}] AS coll
MATCH (s:Person {email: '[email protected]'})
OPTIONAL MATCH (s)-[x:INVITED]->()
WITH s, coll, COUNT(x) AS orig_count
UNWIND coll AS invitee
WITH DISTINCT s, orig_count, invitee
OPTIONAL MATCH (i {email: invitee.email, name: invitee.name})
FOREACH(ignored IN CASE WHEN i IS NULL THEN [1] ELSE [] END |
  CREATE (s)-[r:INVITED]->(:Invitee {email: invitee.email, name: invitee.name})
)
WITH s, orig_count
OPTIONAL MATCH (s)-[x:INVITED]->()
RETURN COUNT(DISTINCT x) - orig_count AS new_relationship_count

Explanation:

The OPTIONAL MATCH clause will generate a NULL value for i if the node pattern is not found.
CASE WHEN i IS NULL THEN [1] ELSE [] END will return a non-empty list if and only if i is NULL.
If the list is empty, then the FOREACH clause will not execute its contained write clause(s). Otherwise, it will execute all of them.
At the beginning, this query first gets a count (orig_count) of the original number of INVITED relationships outgoing from s.
At the end, it gets a final count and subtracts orig_count from that to get new_relationship_count.

Upvotes: 1

how to skip a duplicate node when creating nodes in Neo4j

Answers (2)

Related Questions