Reputation: 4645
I wanted to test insert speed with the latest spring-data-neo4j 4. I modified the movies example to make things simple and comparable.
Try running the test class movies.spring.data.neo4j.repositories.PersonRepositoryTest in this repository: https://github.com/fodon/neo4j-spring-data-speed-demo
It takes 5 seconds to add 400 nodes in this example.
For comparison, this is a speed test with the older version of neo4j: https://github.com/fodon/gs-accessing-data-neo4j-speed
There, the hello.Application class is about 40x faster than spring-data-neo4j-4 for the same job.
Why is spring-data-neo4j-4 slower than the older version? How can it be sped up?
Upvotes: 1
Views: 949
Reputation: 20175
A call to save() is actually a direct persistence request against the database. There is currently no notion of deferring save() calls.
Turn on query logging by adding a logback-test.xml file to your test resources:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <appender name="console" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>%d %5p %40.40c:%4L - %m%n</pattern>
        </encoder>
    </appender>
    <logger name="org.neo4j.ogm" level="info" />
    <root level="info">
        <appender-ref ref="console" />
    </root>
</configuration>
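With logging on, you can see that each save() of a Person actually issues 3 requests: one to create the Car nodes, one to create the Person node, and one to create the HAS relationships between them: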
2016-07-25 05:27:51,093 INFO drivers.embedded.request.EmbeddedRequest: 155 - Request: UNWIND {rows} as row CREATE (n:`Car`) SET n=row.props RETURN row.nodeRef as nodeRef, ID(n) as nodeId with params {rows=[{nodeRef=-590487524, props={type=f27dc1bac12a480}}, {nodeRef=-1760792732, props={type=41ff5d3a69b4a5b4}}, {nodeRef=-637840556, props={type=3e7e77ca5e406a21}}]}
2016-07-25 05:27:54,117 INFO drivers.embedded.request.EmbeddedRequest: 155 - Request: UNWIND {rows} as row CREATE (n:`Person`) SET n=row.props RETURN row.nodeRef as nodeRef, ID(n) as nodeId with params {rows=[{nodeRef=-1446435394, props={name=bafd7ad2721516f8}}]}
2016-07-25 05:27:54,178 INFO drivers.embedded.request.EmbeddedRequest: 155 - Request: UNWIND {rows} as row MATCH (startNode) WHERE ID(startNode) = row.startNodeId MATCH (endNode) WHERE ID(endNode) = row.endNodeId MERGE (startNode)-[rel:`HAS`]->(endNode) RETURN row.relRef as relRefId, ID(rel) as relId with params {rows=[{startNodeId=3, relRef=-712176789, endNodeId=0}, {startNodeId=3, relRef=-821487247, endNodeId=1}, {startNodeId=3, relRef=-31523689, endNodeId=2}]}
Performance would be better if the Person creation statement took all 100 persons as its parameter at once, and likewise for the Car objects.
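To make the idea concrete, here is a minimal sketch of what the parameters for a single batched statement could look like. It only builds the `rows` parameter in the same shape as the log output above (one entry per person, each with a `props` map). The class name `UnwindRows` and the method `params` are hypothetical helpers, not part of the OGM, and actually sending the statement to the database is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; it is not an OGM class.
class UnwindRows {

    // Statement of the same shape as the Person statement in the log above.
    static final String CREATE_PERSONS =
            "UNWIND {rows} AS row CREATE (n:`Person`) SET n = row.props RETURN ID(n) AS nodeId";

    // Builds {rows=[{props={name=...}}, ...]} with one row per person name.
    static Map<String, Object> params(List<String> names) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (String name : names) {
            Map<String, Object> props = new HashMap<>();
            props.put("name", name);

            Map<String, Object> row = new HashMap<>();
            row.put("props", props);
            rows.add(row);
        }
        Map<String, Object> params = new HashMap<>();
        params.put("rows", rows);
        return params;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            names.add("person-" + i);
        }
        Map<String, Object> p = params(names);
        // One statement would carry all 100 rows instead of 100 statements with one row each.
        System.out.println(CREATE_PERSONS);
        System.out.println("rows: " + ((List<?>) p.get("rows")).size());
    }
}
```

With this shape, 100 persons travel in one request rather than in 100 separate ones, which is where the saving would come from.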
As of now the OGM has no out-of-the-box feature for this (there is an open issue: https://github.com/neo4j/neo4j-ogm/issues/208).
However, you can batch the writes yourself by saving a collection instead of saving entities one by one:
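import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import org.junit.Test;
import org.neo4j.ogm.session.Session;
import org.neo4j.ogm.session.SessionFactory;
import org.springframework.test.annotation.DirtiesContext;

@Test
@DirtiesContext
public void speedTest2() {
    SessionFactory sessionFactory = new SessionFactory("hello.neo.domain");
    Session session = sessionFactory.openSession();
    Random rand = new Random(10);
    System.out.println("Before linking up with Neo4j...");
    long start = System.currentTimeMillis();
    long mark = start;
    // 10 batches of 100 persons each
    for (int j = 0; j < 10; j++) {
        List<Person> batch = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            Person greg = new Person(rand);
            batch.add(greg);
        }
        // Save the whole collection in one call instead of 100 separate save() calls
        session.save(batch);
        long now = System.currentTimeMillis();
        // Elapsed milliseconds for this batch
        System.out.format("%d : Time:%d\n", j, now - mark);
        mark = now;
    }
}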
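The difference is substantial. Of the two runs below, the first saves each Person individually and the second saves them in batches of 100 (times in milliseconds):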
Not initialzing DB.
Before linking up with Neo4j...
0 : Time:7318
1 : Time:1731
2 : Time:1555
3 : Time:1481
4 : Time:1237
5 : Time:1176
6 : Time:1101
7 : Time:1094
8 : Time:1114
9 : Time:1015
Not initialzing DB.
Before linking up with Neo4j...
0 : Time:494
1 : Time:272
2 : Time:230
3 : Time:442
4 : Time:320
5 : Time:247
6 : Time:284
7 : Time:288
8 : Time:366
9 : Time:222
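If the entities already sit in one large list rather than being generated batch by batch, the same approach needs the list split into chunks first. The following is a minimal sketch of such a splitter in plain Java. The class name `Batches` and the method `partition` are hypothetical and not part of the OGM or Spring Data.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical utility for illustration; not part of the OGM or Spring Data.
class Batches {

    // Splits items into consecutive sublists of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(from, to)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            items.add(i);
        }
        List<List<Integer>> batches = partition(items, 100);
        System.out.println("batches: " + batches.size());
    }
}
```

Each resulting batch could then be handed to session.save(batch) exactly as in the test above. The batch size of 100 is simply the value used in that test, not a tuned recommendation.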
Upvotes: 5