How to load an adjacency list in a Neo4j database?

Question

I have about 2.7 million rows of adjacency list type data:

CID | MID | CATGY

(CATGY depends on MID)

I want to create multiple edges between customers and merchants. But when I load the data using this query:

load csv with headers from 'file:/small_data.csv' as row
create (c:cust),(m:mer),(ct:cat)
set c.id = row.CID, m.id = row.MID, ct.name = row.CATGY
create (c)-[:buys_at]->(m), (c)-[:buys]->(ct),(m)-[:has_cat]->(ct)

it creates a new set of nodes for every row, so I get 2.7 million nodes of each type, even though the actual number of distinct customers and vendors is lower than that.

How do I create unique nodes based on CID and MID and then match them together based on the records?

A few example records:

1   a    FOOD
1   b    AUTO
2   a    FOOD
2   b    AUTO

Edit:

I tried running this query with a very small sample (25 rows), but it runs endlessly, taking up more and more memory until it saturates the RAM.

load csv with headers from 'file:/small_1.csv' as row
MERGE (c:cust {id: row.CID})
MERGE (m:mer {id: row.MID})
CREATE (c)-[:buys_at]->(m)

Christophe Willemsen · Accepted Answer

What you want is MERGE instead of CREATE:

load csv with headers from 'file:/small_data.csv' as row
MERGE (c:cust {id: row.CID})
MERGE (m:mer {id: row.MID})
MERGE (ct:cat {name: row.CATGY})
MERGE (c)-[:buys]->(ct)
MERGE (m)-[:has_cat]->(ct)
CREATE (c)-[:buys_at]->(m)

See the documentation for MERGE

For performance, first make sure you have enough RAM allocated to Neo4j.

Secondly, make sure you have indexes or unique constraints on:

cust / id
mer / id
cat / name
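
As a rough sketch, the constraints listed above could be created as follows before running the import. This assumes an older Neo4j release (the same era as USING PERIODIC COMMIT); the exact syntax depends on your version, so check it against your server's documentation.

```cypher
// Uniqueness constraints on the properties that MERGE looks up.
// A uniqueness constraint also creates a backing index, so each
// MERGE becomes an index lookup instead of a full label scan.
CREATE CONSTRAINT ON (c:cust) ASSERT c.id IS UNIQUE;
CREATE CONSTRAINT ON (m:mer) ASSERT m.id IS UNIQUE;
CREATE CONSTRAINT ON (ct:cat) ASSERT ct.name IS UNIQUE;

// Newer releases use the FOR ... REQUIRE form instead, for example:
// CREATE CONSTRAINT FOR (c:cust) REQUIRE c.id IS UNIQUE;
```

Run these once, before the LOAD CSV statement, so the import benefits from the indexes from the first row onward.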

Thirdly, you can batch the import so that it commits every 1000 lines:

USING PERIODIC COMMIT 1000
load csv with headers from 'file:/small_data.csv' as row
MERGE (c:cust {id: row.CID})
MERGE (m:mer {id: row.MID})
MERGE (ct:cat {name: row.CATGY})
MERGE (c)-[:buys]->(ct)
MERGE (m)-[:has_cat]->(ct)
CREATE (c)-[:buys_at]->(m)
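
USING PERIODIC COMMIT was removed in Neo4j 5. If you are on a recent release, a roughly equivalent batched import might look like the sketch below. It assumes the same file and labels as above, uses the relationship type has_cat as named in the question, and must run in an auto-commit transaction (in Neo4j Browser, prefix the statement with :auto).

```cypher
// Batched import for newer Neo4j versions: each batch of 1000 rows
// is committed in its own transaction.
LOAD CSV WITH HEADERS FROM 'file:///small_data.csv' AS row
CALL {
  WITH row
  MERGE (c:cust {id: row.CID})
  MERGE (m:mer {id: row.MID})
  MERGE (ct:cat {name: row.CATGY})
  MERGE (c)-[:buys]->(ct)
  MERGE (m)-[:has_cat]->(ct)
  // CREATE (not MERGE) keeps one buys_at edge per input row
  CREATE (c)-[:buys_at]->(m)
} IN TRANSACTIONS OF 1000 ROWS
```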
