Reputation: 3
We built a 3-node cluster in a testing environment and used a Neo4j-JDBC connection to save JSON data into Neo4j.
Creating just 2000 nodes and 2000 relationships from the JSON gives these statistics: total time to save topology data in Neo4j: 456688 ms; links size: 2000; nodes size: 2000.
Saving without checking for duplicate nodes/relationships (with the checkVertex and checkRelation methods removed):
Total time to save topology data in Neo4j: 446979 ms; links size: 2000; nodes size: 4000 (because we are not checking for duplicates, duplicate nodes were created).
Code:
public Connection getConnection(String masterNodeIp, String password) throws Exception {
    return DriverManager.getConnection("jdbc:neo4j:http://" + masterNodeIp + "/?user=neo4j,password=" + password);
}
// Iterate over the edges, creating the source node, the target node, and then the relationship.
try {
    for (Links link : topology.getL2links()) {
        if (conn != null) {
            long srcId = getGraphIdByUniquenessOfOrphan(clientId, link.getSrcMgmtIP());
            GraphId srcGraphId = prepareGraphId(srcId, "DEVICE");
            long tgtId = getGraphIdByUniquenessOfOrphan(clientId, link.getTgtMgmtIP());
            GraphId tgtGraphId = prepareGraphId(tgtId, "DEVICE");
            String srcQuery = createNode(conn, link, false, clientId, discProfileId, srcGraphId);
            if (srcQuery != null && !srcQuery.isEmpty())
                stmt.execute(srcQuery);
            String tgtQuery = createNode(conn, link, true, clientId, discProfileId, tgtGraphId);
            if (tgtQuery != null && !tgtQuery.isEmpty())
                stmt.execute(tgtQuery);
            String relationQuery = processRelation(conn, link, srcGraphId, tgtGraphId);
            if (relationQuery != null && !relationQuery.isEmpty())
                stmt.execute(relationQuery);
        }
    }
} catch (Exception e) {
    System.out.println("Exception in processJsonData ::: " + e.getMessage());
    throw e;
} finally {
    // Guard against a NullPointerException when the statement or connection was never opened.
    if (stmt != null) stmt.close();
    if (conn != null) conn.close();
}
// Before creating a node, check whether it already exists, to avoid duplicates.
private boolean checkVertex(Connection conn, String ip, String hostName, long clientId, long discPId, GraphId graphId) throws Exception {
    Statement stmt = null;
    ResultSet rs = null;
    boolean result = false;
    try {
        stmt = conn.createStatement();
        StringBuilder queryBuffer = new StringBuilder();
        queryBuffer.append(" MATCH (node) WHERE node.id ='" + graphId.getId() + "' AND node.sourceType = '" + graphId.getSourceType() + "'");
        queryBuffer.append(" RETURN node");
        rs = stmt.executeQuery(queryBuffer.toString());
        // Any returned row means the node already exists.
        result = rs.next();
    } catch (Exception e) {
        System.out.println("Exception in fetching node ::: " + e.getMessage());
        throw e;
    } finally {
        // Guard against a NullPointerException if creating the statement or running the query failed.
        if (rs != null) rs.close();
        if (stmt != null) stmt.close();
    }
    return result;
}
// Before creating a relationship, check whether it already exists, to avoid duplicates.
private boolean checkRelation(Connection conn, Links link, GraphId srcGraphId, GraphId tgtGraphId) throws SQLException {
    Statement stmt = null;
    ResultSet rs = null;
    boolean result = false;
    try {
        stmt = conn.createStatement();
        StringBuilder queryBuffer = new StringBuilder();
        queryBuffer.append(" MATCH (src:resource)-[r:topology]->(tgt:resource) WHERE src.id='" + srcGraphId.getId()
                + "' AND tgt.id='" + tgtGraphId.getId() + "' AND r.srcInt='" + link.getSrcInt() + "' AND r.tgtInt='" + link.getTgtInt() + "'");
        queryBuffer.append(" RETURN r");
        rs = stmt.executeQuery(queryBuffer.toString());
        // Any returned row means the relationship already exists.
        result = rs.next();
    } catch (SQLException e) {
        System.out.println("Exception in fetching relation ::: " + e.getMessage());
        // Rethrow, as checkVertex does, so a failed lookup is not mistaken for "no duplicate".
        throw e;
    } finally {
        // Guard against a NullPointerException if creating the statement or running the query failed.
        if (rs != null) rs.close();
        if (stmt != null) stmt.close();
    }
    return result;
}
We created indexes for the properties used in these duplicate-check queries, but performance is still slow.
Also, please let us know how to use a "Node key" unique constraint from Java so that we can skip the checkVertex query. We tried catching the ConstraintViolationException and logging it instead of rethrowing it, but the exception is still thrown and no nodes are saved.
Upvotes: 0
Views: 291
Reputation: 41676
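There are a lot of things you can improve, mainly batching your writes and passing values as query parameters instead of concatenating them into the query string: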
For Batching: https://medium.com/@mesirii/5-tips-tricks-for-fast-batched-updates-of-graph-structures-with-neo4j-and-cypher-73c7f693c8cc
For parameters: http://neo4j-contrib.github.io/neo4j-jdbc/#_minimum_viable_snippet
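As a rough illustration of both points, here is a minimal sketch using only the standard `java.sql` API. It is not taken from the linked pages. `BatchedTopologyWriter`, `Device`, `partition` and `saveDevices` are hypothetical names; the `resource` label and the `id`/`sourceType` properties are borrowed from the queries in the question, and treating `id` as a string is an assumption. It also assumes the Neo4j JDBC driver in use accepts `?` placeholders (older versions document `{1}`-style placeholders, so check the snippet linked above) and implements `addBatch`/`executeBatch`.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class BatchedTopologyWriter {

    /** Hypothetical value object standing in for one device from the topology JSON. */
    static final class Device {
        final String id;
        final String sourceType;

        Device(String id, String sourceType) {
            this.id = id;
            this.sourceType = sourceType;
        }
    }

    // MERGE creates the node only when no matching node exists, so a separate
    // existence query per node is not needed. Placeholder syntax is an assumption
    // about the driver version (see the lead-in).
    static final String MERGE_DEVICE = "MERGE (n:resource {id: ?, sourceType: ?})";

    /** Splits a list into consecutive chunks of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            chunks.add(new ArrayList<>(items.subList(i, end)));
        }
        return chunks;
    }

    /** Writes all devices, committing once per chunk instead of once per statement. */
    static void saveDevices(Connection conn, List<Device> devices, int batchSize) throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(MERGE_DEVICE)) {
            for (List<Device> chunk : partition(devices, batchSize)) {
                for (Device d : chunk) {
                    // Values travel as parameters, never concatenated into the query text.
                    ps.setString(1, d.id);
                    ps.setString(2, d.sourceType);
                    ps.addBatch();
                }
                ps.executeBatch();
                conn.commit();
            }
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```

A call such as `BatchedTopologyWriter.saveDevices(conn, devices, 1000)` would replace the per-link `checkVertex` and `stmt.execute` calls for nodes; the batch size of 1000 is only a starting point to tune. Because the `MERGE` pattern carries the `resource` label, an index or a uniqueness/node key constraint on that label's properties can back the lookup, whereas an unlabelled `MATCH (node)` cannot use a label index.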
Upvotes: 4