Reputation: 83
I have 2 CSV files exported from mysql :
# Disease CSV Headers # Disease Master ID (autoincrement, pk) disease_name
# Tests CSV Headers #
Test Master
ID (autoincrement, pk),
test_name,
parent_disease_ID (points to ID column in Disease.
Master tbl)
I run following cypher commands :
LOAD CSV WITH HEADERS FROM.
"http://localhost/disease_mstr.csv" AS line
MERGE (d:Disease {did: toInteger(line.ID), diseasename:
line.disease_name})
LOAD CSV WITH HEADERS FROM.
"http://localhost/test_mstr.csv" AS line
MERGE (d:Tests {tid: toInteger(line.ID), testname:
line.test_name, did: toInteger(line.parent_disease_ID)})
MATCH (t:Tests), (d:Disease) CREATE (t:Tests)-
[r:TEST_FOR]->(d:Disease) RETURN t, r, d
Above cypher returns one disease connected to many tests whereas i want just the reverse! Can someone please correct me?
Upvotes: 0
Views: 61
Reputation: 67044
[EDITED]
In the query that creates the relationships, you need to filter for Tests
and Disease
nodes that share the same did
value:
MATCH (t:Tests), (d:Disease)
WHERE t.did = d.did
MERGE (t)-[r:TEST_FOR]->(d)
RETURN t, r, d;
This query also replaced CREATE
with MERGE
to avoid creating duplicate TEST_FOR
relationships between the same t
and d
pair. If you already have such duplicate relationships, delete them first.
Also, for efficiency, you should consider creating in index on either :Disease(did)
or :Tests(did)
-- whichever has the most instances.
Upvotes: 0
Reputation: 11216
You could create the disease nodes, test nodes and relationship between test and disease nodes in one pass of the tests file.
LOAD CSV WITH HEADERS
FROM "http://localhost/test_mstr.csv"
AS line
MERGE (disease:Disease {did: toInteger(line.parent_disease_ID)})
MERGE (test:Tests {tid: toInteger(line.ID), testname:
line.test_name})
MERGE (test)-[r:TEST_FOR]->(did)
And then update the disease names after the fact in a second pass.
LOAD CSV WITH HEADERS
FROM "http://localhost/disease_mstr.csv" AS line
MERGE (d:Disease {did: toInteger(line.ID)})
SET d.diseasename = line.disease_name
Upvotes: 1