Reputation: 436
I am creating nodes in Neo4j using the neo4j-java driver
with the help of the following Cypher query.
String cipherQuery = "CREATE (n:MLObsTemp { personId: " + personId + ",conceptId: " + conceptId
+ ",obsId: " + obsId + ",MLObsId: " + mlObsId + ",encounterId: " + encounterId + "}) RETURN n";
Function for creating query
createNeo4JObsNode(String cipherQuery);
Implementation of the Function
private void createNeo4JObsNode(String cipherQuery) throws Exception {
try (ConNeo4j greeter = new ConNeo4j("bolt://localhost:7687", "neo4j", "qwas")) {
System.out.println("Executing query : " + cipherQuery);
try (Session session = driver.session()) {
StatementResult result = session.run(cipherQuery);
} catch (Exception e) {
System.out.println("Error" + e.getMessage());
}
} catch (Exception e) {
e.printStackTrace();
}
}
I create the relationships for the above nodes using the code below
String obsMatchQuery = "MATCH (m:MLObsTemp),(o:Obs) WHERE m.obsId=o.obsId CREATE (m)-[:OBS]->(o)";
createNeo4JObsNode(obsMatchQuery);
String personMatchQuery = "MATCH (m:MLObsTemp),(p:Person) WHERE m.personId=p.personId CREATE (m)-[:PERSON]->(p)";
createNeo4JObsNode(personMatchQuery);
String encounterMatchQuery = "MATCH (m:MLObsTemp),(e:Encounter) WHERE m.encounterId=e.encounterId CREATE (m)-[:ENCOUNTER]->(e)";
createNeo4JObsNode(encounterMatchQuery);
String conceptMatchQuery = "MATCH (m:MLObsTemp),(c:Concept) WHERE m.conceptId=c.conceptId CREATE (m)-[:CONCEPT]->(c)";
createNeo4JObsNode(conceptMatchQuery);
It takes me 13 seconds on average to create the nodes and 12 seconds to create the relationships. I have 350k records in my database for which I have to create nodes and their respective relationships.
How can I improve my code? Moreover, is this the best way to create nodes in Neo4j using the Bolt server and the neo4j-java driver?
I am now using query parameters in my code
HashMap<String, Object> parameters = new HashMap<String, Object>();
((HashMap<String, Object>) parameters).put("personId", 1390);
((HashMap<String, Object>) parameters).put("obsId", 14001);
((HashMap<String, Object>) parameters).put("conceptId", 5978);
((HashMap<String, Object>) parameters).put("encounterId", 10810);
((HashMap<String, Object>) parameters).put("mlobsId", 2);
String cypherQuery=
"CREATE (m:MLObsTemp { personId: $personId, ObsId: $obsId, conceptId: $conceptId, MLObsId: $mlobsId, encounterId: $encounterId}) "
+ "WITH m MATCH (p:Person { personId: $personId }) CREATE (m)-[:PERSON]->(p) "
+ "WITH m MATCH (e:Encounter {encounterId: $encounterId }) CREATE (m)-[:Encounter]->(e) "
+ "WITH m MATCH (o:Obs {obsId: $obsId }) CREATE (m)-[:OBS]->(o) "
+ "WITH m MATCH (c:Concept {conceptId: $conceptId }) CREATE (m)-[:CONCEPT]->(c) "
+ " RETURN m";
Creating Node function
try {
ConNeo4j greeter = new ConNeo4j("bolt://localhost:7687", "neo4j", "qwas");
try {
Session session = driver.session();
StatementResult result = session.run(cypherQuery, parameters);
System.out.println(result);
} catch (Exception e) {
System.out.println("[WARNING] Null Row");
}
} catch (Exception e) {
e.printStackTrace();
}
I am also performing the indexing in order to speed up the process
CREATE CONSTRAINT ON (P:Person) ASSERT P.personId IS UNIQUE
CREATE CONSTRAINT ON (E:Encounter) ASSERT E.encounterId IS UNIQUE
CREATE CONSTRAINT ON (O:Obs) ASSERT O.obsId IS UNIQUE
CREATE CONSTRAINT ON (C:Concept) ASSERT C.conceptId IS UNIQUE
Here is the plan for 1 cypher query-profile
Now the performance has improved, but not significantly. I am using neo4j-java-driver version 1.6.1. How can I batch my Cypher queries to improve the performance further?
Upvotes: 0
Views: 585
Reputation: 8833
You should try to minimize redundant work in your Cypher queries.
MLObsTemp has a lot of redundant properties, and you are searching for it to create every link. Relationships remove the need to create properties for foreign keys (node ids).
I would recommend a Cypher query that does everything together and uses parameters, like this:
CREATE (m:MLObsTemp)
WITH m MATCH (p:Person {id: $person_id}) CREATE (m)-[:PERSON]->(p)
WITH m MATCH (e:Encounter {id: $encounter_id}) CREATE (m)-[:Encounter]->(e)
WITH m MATCH (c:Concept {id: $concept_id}) CREATE (m)-[:CONCEPT]->(c)
// SNIP more MATCH/CREATE
RETURN m
This way, Neo4j doesn't have to find m repeatedly for every relationship. You don't need the ID properties, because that is effectively what the relationship you just created is. Neo4j is very efficient at walking edges (relationships), so just follow the relationship if you need the id value.
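For the batching part of the question, one common approach is to pass many rows as a single list parameter and UNWIND it, so one statement creates a whole chunk of nodes and relationships. The following is only a minimal sketch under some assumptions: ObsBatchWriter, row and writeInBatches are hypothetical names, the labels and property names are taken from the question, and the runner argument stands in for the driver call (with the 1.x driver, something like (q, p) -> session.run(q, p) on a session that is opened once and reused).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

public class ObsBatchWriter {

    // One statement per batch: UNWIND expands the $rows list into one row per
    // observation; each row creates its node and links it to the matched nodes.
    static final String BATCH_QUERY =
        "UNWIND $rows AS row "
      + "CREATE (m:MLObsTemp {MLObsId: row.mlobsId}) "
      + "WITH m, row MATCH (p:Person {personId: row.personId}) CREATE (m)-[:PERSON]->(p) "
      + "WITH m, row MATCH (e:Encounter {encounterId: row.encounterId}) CREATE (m)-[:ENCOUNTER]->(e) "
      + "WITH m, row MATCH (o:Obs {obsId: row.obsId}) CREATE (m)-[:OBS]->(o) "
      + "WITH m, row MATCH (c:Concept {conceptId: row.conceptId}) CREATE (m)-[:CONCEPT]->(c)";

    // Builds the map for one observation; the keys match row.* in BATCH_QUERY.
    static Map<String, Object> row(long personId, long obsId, long conceptId,
                                   long encounterId, long mlobsId) {
        Map<String, Object> r = new HashMap<>();
        r.put("personId", personId);
        r.put("obsId", obsId);
        r.put("conceptId", conceptId);
        r.put("encounterId", encounterId);
        r.put("mlobsId", mlobsId);
        return r;
    }

    // Splits rows into chunks of at most batchSize and passes each chunk to the
    // runner as the $rows parameter. Returns the number of statements sent.
    static int writeInBatches(List<Map<String, Object>> rows, int batchSize,
                              BiConsumer<String, Map<String, Object>> runner) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int statements = 0;
        for (int from = 0; from < rows.size(); from += batchSize) {
            int to = Math.min(from + batchSize, rows.size());
            Map<String, Object> params = new HashMap<>();
            params.put("rows", new ArrayList<>(rows.subList(from, to)));
            runner.accept(BATCH_QUERY, params);
            statements++;
        }
        return statements;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (long i = 1; i <= 2500; i++) {
            rows.add(row(1390, 14000 + i, 5978, 10810, i));
        }
        // Stand-in runner that only reports the batch size instead of calling Neo4j.
        int sent = writeInBatches(rows, 1000,
            (q, p) -> System.out.println("rows in batch: " + ((List<?>) p.get("rows")).size()));
        System.out.println("statements sent: " + sent);
    }
}
```

The batch size of 1000 is an arbitrary starting point, so measure a few values against your own data. Note that, as in the single-row query, a row whose MATCH finds nothing stops at that point and gets no further relationships.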
TIPS: (mileage may vary across Neo4j versions)
Inline property matching: MATCH (n{id:"rawr"}) vs MATCH (n) WHERE n.id="rawr"
Parameters (the $thing_id syntax used in the above query) also protect you from Cypher injection (see SQL injection).
Upvotes: 2