Reputation: 351
I'm using a Python script to generate and execute queries from data loaded from a CSV file. I've got a substantial amount of data to import, so speed is very important.
The problem I'm having is that merging a relationship between two nodes takes a very long time: including the Cypher that creates the relationships makes a query take around 3 seconds, versus around 100 ms without it.
Here's a small bit of the query I'm trying to execute:
MERGE (s0:Chemical{`name`: "10074-g5"})
SET s0.`name`="10074-g5"
MERGE (y0:Gene{`gene-id`: "4149"})
SET y0.`name`="MAX"
SET y0.`gene-id`="4149"
MERGE (s0)-[:INTERACTS_WITH]->(y0)
MERGE (s1:Chemical{`name`: "10074-g5"})
SET s1.`name`="10074-g5"
MERGE (y1:Gene{`gene-id`: "4149"})
SET y1.`name`="MAX"
SET y1.`gene-id`="4149"
MERGE (s1)-[:INTERACTS_WITH]->(y1)
Any suggestions on why this is running so slowly? I've got indexes set up on Chemical->name and Gene->gene-id, so I honestly don't understand why it takes this long.
Upvotes: 0
Views: 423
Reputation: 67044
Some of your SET clauses are just setting properties to the same values they already have (as guaranteed by the preceding MERGE clauses).
The other SET clauses probably only need to be executed if the MERGE had created a new node. So, they should probably be preceded by ON CREATE.
Check that the index matches the property you MERGE on: a :Gene(id) index will not be used here, whereas your code actually requires a :Gene(gene-id) index.
Below is sample Cypher code that uses the dataList parameter (a list of maps containing the desired property values), which fixes most of the above issues. The UNWIND clause just "unwinds" the list into individual maps.
UNWIND $dataList AS d
MERGE (s:Chemical{name: d.sName})
MERGE (y:Gene{`gene-id`: d.yId})
ON CREATE SET y.name=d.yName
MERGE (s)-[:INTERACTS_WITH]->(y)
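On the Python side, the dataList parameter could be built from the CSV and sent in batches, roughly as sketched below. This is only a sketch under assumptions: the column names chemical_name, gene_id and gene_name are hypothetical, the batch size of 1000 is an arbitrary starting point, and session is assumed to be a session from the official neo4j Python driver (anything exposing run(query, **parameters) that returns a result with consume()).

```python
import csv

# Same statement as above, parameterised on $dataList.
QUERY = """
UNWIND $dataList AS d
MERGE (s:Chemical {name: d.sName})
MERGE (y:Gene {`gene-id`: d.yId})
ON CREATE SET y.name = d.yName
MERGE (s)-[:INTERACTS_WITH]->(y)
"""


def rows_from_csv(handle):
    """Turn CSV rows into the maps the query expects (sName, yId, yName).

    The column names chemical_name, gene_id and gene_name are hypothetical;
    replace them with the headers of the real file.
    """
    return [
        {"sName": r["chemical_name"], "yId": r["gene_id"], "yName": r["gene_name"]}
        for r in csv.DictReader(handle)
    ]


def batches(rows, size):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def import_rows(session, rows, batch_size=1000):
    """Send one parameterised query per batch instead of one query per row.

    `session` is assumed to behave like a neo4j driver session.
    Returns the number of queries sent.
    """
    sent = 0
    for batch in batches(rows, batch_size):
        # Keyword arguments are passed to the query as parameters.
        session.run(QUERY, dataList=batch).consume()
        sent += 1
    return sent
```

Because the query text never changes and only the parameter does, the script no longer has to generate a long statement with numbered variables (s0, y0, s1, y1, ...) for every chunk of the file.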
Upvotes: 4