Reputation: 2519
I have about 2.7 million rows of adjacency list type data:
CID | MID | CATGY
(CATGY depends on MID)
I want to create multiple edges between customers and merchants. But when I load the data with this query:
load csv with headers from 'file:/small_data.csv' as row
create (c:cust),(m:mer),(ct:cat)
set c.id = row.CID, m.id = row.MID, ct.name = row.CATGY
create (c)-[:buys_at]->(m), (c)-[:buys]->(ct),(m)-[:has_cat]->(ct)
it creates a new node of each type for every row, so I end up with 2.7 million nodes of each type, even though the number of distinct customers and merchants is much smaller than that.
How do I create unique nodes based on CID and MID and then match them together based on the records?
A few example records:
1 a FOOD
1 b AUTO
2 a FOOD
2 b AUTO
Edit:
I tried running this query with a very small sample (25 rows), but it runs endlessly, taking up more and more memory until it saturates the RAM usage.
load csv with headers from 'file:/small_1.csv' as row
MERGE (c:cust {id: row.CID})
MERGE (m:mer {id: row.MID})
CREATE (c)-[:buys_at]->(m)
Upvotes: 1
Views: 1126
Reputation: 20185
What you want is MERGE instead of CREATE:
load csv with headers from 'file:/small_data.csv' as row
MERGE (c:cust {id: row.CID})
MERGE (m:mer {id: row.MID})
MERGE (ct:cat {name: row.CATGY})
MERGE (c)-[:buys]->(ct)
MERGE (m)-[:has_cat]->(ct)
CREATE (c)-[:buys_at]->(m)
See the documentation for MERGE
For performance, first make sure you have enough RAM allocated.
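Secondly, make sure you have indexes or unique constraints on the properties you MERGE on (cust.id, mer.id and cat.name here); without them, every MERGE has to scan all nodes with that label.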
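A sketch of what such constraints might look like, assuming the labels and properties used in the query above and the older constraint syntax that matches a Neo4j version still accepting USING PERIODIC COMMIT. Each statement would be run on its own before the import:

```cypher
// Assumption: one uniqueness constraint per property that MERGE looks up.
// A uniqueness constraint is backed by an index on that property.
CREATE CONSTRAINT ON (c:cust) ASSERT c.id IS UNIQUE;
CREATE CONSTRAINT ON (m:mer) ASSERT m.id IS UNIQUE;
CREATE CONSTRAINT ON (ct:cat) ASSERT ct.name IS UNIQUE;
```

Newer Neo4j releases write the same thing as CREATE CONSTRAINT FOR (c:cust) REQUIRE c.id IS UNIQUE. A uniqueness constraint can only be created if the existing data has no duplicates, so nodes left over from an earlier CREATE-based import would have to be deleted first.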
Thirdly, you can commit in batches of 1000 lines:
USING PERIODIC COMMIT 1000
load csv with headers from 'file:/small_data.csv' as row
MERGE (c:cust {id: row.CID})
MERGE (m:mer {id: row.MID})
MERGE (ct:cat {name: row.CATGY})
MERGE (c)-[:buys]->(ct)
MERGE (m)-[:has_cat]->(ct)
CREATE (c)-[:buys_at]->(m)
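To check the result against the four example records from the question, counting queries like the following can be used. This is only a sketch, assuming the file has a header line CID,MID,CATGY, contains just those four rows, and was loaded into an empty database:

```cypher
// Distinct nodes per label; for the four sample rows each count should be 2.
MATCH (c:cust) RETURN count(c) AS customers;
MATCH (m:mer) RETURN count(m) AS merchants;
MATCH (ct:cat) RETURN count(ct) AS categories;

// buys_at is written with CREATE, so there is one relationship per CSV row:
// 4 for the sample.
MATCH (:cust)-[r:buys_at]->(:mer) RETURN count(r) AS purchases;
```

Because buys_at uses CREATE, a customer who appears several times with the same merchant gets several parallel edges, which is what the question asks for. The relationships written with MERGE exist at most once per pair of nodes.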
Upvotes: 3
Reputation: 141
You should create the nodes with MERGE and put the property in the pattern, like MERGE (r:products {name: "fallout4"}). Why use SET?
You could first create the unique nodes:
FOOD, AUTO, ... with one label
a, b with another label
1, 2 with a third label
and then create the relationships between them, e.g. CREATE (a)-[:has_cat]->(food), or use MERGE. That way, rows that share the same property value point to the same node.
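The two-step approach described here could look roughly like this. The labels, property names and file name are taken from the question, which is an assumption since this answer does not name them:

```cypher
// Pass 1: create each distinct node once.
LOAD CSV WITH HEADERS FROM 'file:/small_data.csv' AS row
MERGE (:cust {id: row.CID})
MERGE (:mer {id: row.MID})
MERGE (:cat {name: row.CATGY});

// Pass 2: look the existing nodes up again and connect them.
LOAD CSV WITH HEADERS FROM 'file:/small_data.csv' AS row
MATCH (c:cust {id: row.CID})
MATCH (m:mer {id: row.MID})
MATCH (ct:cat {name: row.CATGY})
MERGE (m)-[:has_cat]->(ct)
MERGE (c)-[:buys]->(ct)
CREATE (c)-[:buys_at]->(m);
```

The two statements are run one after the other. The second pass only matches nodes and never creates any, so the node counts stay at the number of distinct values.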
Upvotes: 0