goelakash

Reputation: 2519

How to load an adjacency list in a Neo4j database?

I have about 2.7 million rows of adjacency list type data:

CID | MID | CATGY

(CATGY depends on MID)

I want to create multiple edges between the various customers and merchants. But when I load the data using this query:

load csv with headers from 'file:/small_data.csv' as row
create (c:cust),(m:mer),(ct:cat)
set c.id = row.CID, m.id = row.MID, ct.name = row.CATGY
create (c)-[:buys_at]->(m), (c)-[:buys]->(ct),(m)-[:has_cat]->(ct)

it creates a separate node for every row, so I get 2.7 million nodes of each type. The actual number of distinct customers and merchants is smaller than that.

How do I create unique nodes based on CID and MID and then match them together based on the records?

A few example records:

1   a    FOOD
1   b    AUTO
2   a    FOOD
2   b    AUTO

Edit:

I tried running this query with a very small sample (25 rows), but it runs endlessly, taking up more and more memory until it saturates the RAM.

load csv with headers from 'file:/small_1.csv' as row
MERGE (c:cust {id: row.CID})
MERGE (m:mer {id: row.MID})
CREATE (c)-[:buys_at]->(m)

Upvotes: 1

Views: 1126

Answers (2)

Christophe Willemsen

Reputation: 20185

What you want is MERGE instead of CREATE:

load csv with headers from 'file:/small_data.csv' as row
MERGE (c:cust {id: row.CID})
MERGE (m:mer {id: row.MID})
MERGE (ct:cat {name: row.CATGY})
MERGE (c)-[:buys]->(ct)
MERGE (m)-[:has_cat]->(ct)
CREATE (c)-[:buys_at]->(m)

See the documentation for MERGE

For performance, first make sure you have enough RAM allocated.

Secondly, make sure you have indexes or unique constraints on:

  • cust / id
  • mer / id
  • cat / name
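As a minimal sketch, the three constraints listed above could be created as follows before running the import. The statements assume the older constraint syntax used by Neo4j releases before 5; the exact form depends on your server version, so check it against the documentation for your release.

```cypher
// Sketch only: unique constraints on the properties used by MERGE.
// Older syntax (Neo4j releases before 5). Run each statement on its own
// if your client does not accept several statements at once.
CREATE CONSTRAINT ON (c:cust) ASSERT c.id IS UNIQUE;
CREATE CONSTRAINT ON (m:mer) ASSERT m.id IS UNIQUE;
CREATE CONSTRAINT ON (ct:cat) ASSERT ct.name IS UNIQUE;

// Newer releases use the FOR ... REQUIRE form instead, for example:
// CREATE CONSTRAINT FOR (c:cust) REQUIRE c.id IS UNIQUE;
```

A unique constraint also creates an index on the property, so each MERGE can look up an existing node by id or name instead of scanning every node with that label.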

Thirdly, you can batch commit every 1000 lines:

USING PERIODIC COMMIT 1000
load csv with headers from 'file:/small_data.csv' as row
MERGE (c:cust {id: row.CID})
MERGE (m:mer {id: row.MID})
MERGE (ct:cat {name: row.CATGY})
MERGE (c)-[:buys]->(ct)
MERGE (m)-[:has_cat]->(ct)
CREATE (c)-[:buys_at]->(m)

Upvotes: 3

Mvde

Reputation: 141

You should create the nodes with MERGE, like merge (r:products {name: "fallout4"})

Why use SET? You could create the unique nodes first:

  • FOOD, AUTO, ... with one label
  • a, b with another label
  • 1, 2 with another label

Then you create the relationships, with create (a)-[:has_cat]->(food) or with MERGE, so that records sharing the same property value point to the same node.
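The nodes-first approach described above could be sketched as separate passes over the same file. This is only an illustration: it reuses the file name, labels, and relationship types from the question, and it assumes the id and name properties are indexed so that the MATCH lookups are fast.

```cypher
// Pass 1: one node per distinct customer id.
LOAD CSV WITH HEADERS FROM 'file:/small_data.csv' AS row
MERGE (:cust {id: row.CID});

// Pass 2: one node per distinct merchant id.
LOAD CSV WITH HEADERS FROM 'file:/small_data.csv' AS row
MERGE (:mer {id: row.MID});

// Pass 3: one node per distinct category name.
LOAD CSV WITH HEADERS FROM 'file:/small_data.csv' AS row
MERGE (:cat {name: row.CATGY});

// Pass 4: look up the existing nodes and connect them.
// MERGE keeps a single buys / has_cat relationship per pair of nodes,
// while CREATE adds one buys_at relationship per record.
LOAD CSV WITH HEADERS FROM 'file:/small_data.csv' AS row
MATCH (c:cust {id: row.CID})
MATCH (m:mer {id: row.MID})
MATCH (ct:cat {name: row.CATGY})
MERGE (c)-[:buys]->(ct)
MERGE (m)-[:has_cat]->(ct)
CREATE (c)-[:buys_at]->(m);
```

Each statement is meant to be run on its own. Because the last pass only matches nodes that already exist, it cannot create duplicate customers, merchants, or categories.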

Upvotes: 0
