Sharath
Sharath

Reputation: 57

neo4j create relationship not giving output

I have 2 csv files: Entity, with 2.8M records and Rships, with 4.2M records. Entity has a list of ENT_ID and PARENTID. If ENT_ID has PARENTID '0', it means no parent. If it does, then it would be one of the ENT_ID from above. I need to create a relationship of ENT_ID and PARENTID. I tried loading this through the import tool available for Neo4j2.3.4 community edition, however I kept receiving very many errors. In the end, I managed to load it using the load csv cypher query.

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///C:/...(read file address here)/Entities.txt" AS Entity FIELDTERMINATOR '|'
CREATE (n:Entity{ENT_ID: Entity.ENT_ID,NAME: Entity.NAME,ENTRYTYPE: Entity.ENTRYTYPE,  PARENTID: Entity.PARENTID,ENTRYCATEGORY: Entity.ENTRYCATEGORY,ENTRYSUBCATEGORY: Entity.ENTRYSUBCATEGORY,COUNTRY: Entity.COUNTRY,PWC_ADL_ID: Entity.PWC_ADL_ID })

I created relationship between the PARENTID and ENTID using the following:

PROFILE
MATCH(Entity)
MATCH (a:Entity {ENT_ID : Entity.ENT_ID})
WITH Entity, a
MATCH (b:Entity {ENT_ID : Entity.PARENTID})
WITH a,b
MERGE (a)-[r:RELATION]->(b)

Now I loaded the CSV file for relationship as :

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///C:/.../EntitiesRelationships.txt" AS Rships FIELDTERMINATOR '|' 
CREATE (n:Rships{RID: Rships.RID,Ent_IDParent: Rships.Ent_IDParent,Ent_IDChild: Rships.Ent_IDChild,  RelationID: Rships.RelationID })

I created index on both the CSVs:

CREATE INDEX ON :Entity(ENT_ID)

CREATE INDEX ON :Rships(Ent_IDParent)
CREATE INDEX ON :Rships(Ent_IDChild)

Till here the above code runs fine although it takes nearly a day to do so:

Now, when I try to create relationship using the query:

PROFILE
Match(Rships)
MATCH(a:Rships {ENT_IDParent: Rships.ENT_IDParent})
WITH Rships, a
MATCH(b:Rships {ENT_IDParent: Rships.ENT_IDChild})
WITH a,b
MERGE (a)-[r:RELATION]->(b)

This query keeps on running for about an hour without any result.

Help would be appreciated.

Thanks

Upvotes: 2

Views: 42

Answers (1)

Michael Hunger
Michael Hunger

Reputation: 41706

Labels, Properties and relationship-types are case-sensitive. Your index has different spellings than the properties in your statement.

You should also look at your statements with EXPLAIN or PROFILE then you see it immediately.

For the 2nd statement I would also use LOAD CSV again, to drive the lookup of the entities, so that you get periodic commit.

In total your import shouldn't take longer than a few minutes.

// create unique constraint
CREATE CONSTRAINT ON (n:Entity) ASSERT n.ENT_ID IS UNIQUE;

USING PERIODIC COMMIT 
LOAD CSV WITH HEADERS FROM "file:///C:/...(read file address here)/Entities.txt" AS row FIELDTERMINATOR '|' 
CREATE (n:Entity {ENT_ID: row.ENT_ID,NAME: row.NAME,ENTRYTYPE: row.ENTRYTYPE, PARENTID: row.PARENTID, ENTRYCATEGORY: row.ENTRYCATEGORY, ENTRYSUBCATEGORY: row.ENTRYSUBCATEGORY, COUNTRY: row.COUNTRY, PWC_ADL_ID: row.PWC_ADL_ID });
// you can also use this if you want to set all properties:
// CREATE (n:Entity) SET n = row

USING PERIODIC COMMIT 
LOAD CSV WITH HEADERS FROM "file:///C:/...(read file address here)/Entities.txt" AS row FIELDTERMINATOR '|' 
MATCH (a:Entity {ENT_ID : row.ENT_ID})
MATCH (b:Entity {ENT_ID : row.PARENTID}) 
MERGE (a)-[:PARENT]->(b);

Why do you create the relationships as nodes and not as relationships, that doesn't make sense to me ???

Instead of:

USING PERIODIC COMMIT 
LOAD CSV WITH HEADERS FROM "file:///C:/.../EntitiesRelationships.txt" AS row FIELDTERMINATOR '|' 
CREATE (n:Rships {RID: row.RID,Ent_IDParent: row.Ent_IDParent, Ent_IDChild: row.Ent_IDChild, RelationID: row.RelationID });

I would do:

USING PERIODIC COMMIT 
LOAD CSV WITH HEADERS FROM "file:///C:/.../EntitiesRelationships.txt" AS row FIELDTERMINATOR '|' 
MATCH (a:Entity {ENT_ID : row.Ent_IDChild})
MATCH (b:Entity {ENT_ID : row.Ent_IDParent}) 
CREATE (a)-[:PARENT {RID: row.RID, RelationID: row.RelationID}]->(b);

I created index on both the CSVs:

CREATE INDEX ON :Entity(ENT_ID)

CREATE INDEX ON :Rships(Ent_IDParent)
CREATE INDEX ON :Rships(Ent_IDChild)

Till here the above code runs fine although it takes nearly a day to do so:

Now, when I try to create relationship using the query:

// you also had a typo in this query, matching b also for the parent-id

USING PERIODIC COMMIT 
LOAD CSV WITH HEADERS FROM "file:///C:/.../EntitiesRelationships.txt" AS row FIELDTERMINATOR '|' 
MATCH(a:Rships {Ent_IDParent: row.ENT_IDParent}) 
MATCH(b:Rships {Ent_IDChild: row.ENT_IDChild}) 
MERGE (a)-[:PARENT]->(b)

These two statements make no sense at all, you create random cross products, profile should show you a huge blow up in rows and database-hits like billions

MATCH(Entity) MATCH (a:Entity {ENT_ID : Entity.ENT_ID}) WITH Entity, a MATCH (b:Entity {ENT_ID : Entity.PARENTID}) WITH a,b MERGE (a)-[r:RELATION]->(b)

Match(Rships) MATCH(a:Rships {ENT_IDParent: Rships.ENT_IDParent}) WITH Rships, a MATCH(b:Rships {ENT_IDParent: Rships.ENT_IDChild}) WITH a,b MERGE (a)-[r:RELATION]->(b)

Upvotes: 1

Related Questions