Reputation: 498
I have a large number of classes that have been generated and annotated using the EMF/Texo combination. I persist them on a SQL Server Database using JPA/Eclipselink.
This works fine, however the performance is horrible when a high number of objects need to be persisted. So I've written two test cases (see TestBulkInserts.java
), which compare the performance of a bulk insert using the framework (foo
) with a plain JDBC bulk insert (bar
).
When inserting 10000 Objects, which is a bulk insert below the average size. foo()
and bar()
give the following times:
Duration JPA/Texo: 19.620ms
Duration plain JDBC: 892ms
I'm wondering why there is such a huge difference (more then a factor 20!). With larger sizes it even get's worse.
The DatabaseObject
class extends the PersistableObjectClass.java
(see below), and both are generated (including the respective DAO class) with Texo + EMF.
I haven't added any particular settings in the persistence.xml
, except the necessary connection details.
TestBulkInserts.java:
import java.sql.Connection;
import java.sql.Date;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
...
import com.ownproject.loader.generated.DbModelPackage;
import com.ownproject.loader.DatabaseObject;
import com.ownproject.loader.dao.DatabaseObjectDao;
import javax.persistence.Persistence;
import org.eclipse.emf.texo.server.store.EntityManagerProvider;
import org.junit.Test;
public class TestBulkInserts {
private static final int NUM_LOOPS = 10000;
@Test
public void foo() {
TestMethods.connectTestDBandEMF();
// basically does this
// DbModelPackage.initialize();
// EntityManagerProvider.getInstance().setEntityManagerFactory(Persistence.createEntityManagerFactory(PERSISTENCE_UNIT_TEST));
Stopwatch sw = Stopwatch.createStarted();
DatabaseObjectDao dao = new DatabaseObjectDao();
dao.getEntityManager().getTransaction().begin();
for (int i = 0; i < NUM_LOOPS; i++) {
DatabaseObject dbo = new DatabaseObject();
dbo.setString(UUID.randomUUID().toString());
dbo.setInsert_time(Date.valueOf(LocalDate.now()));
dao.insert(dbo);
}
dao.getEntityManager().getTransaction().commit();
sw.stop();
System.out.println(String.format("Duration JPA/Texo: %,dms", sw.elapsed(TimeUnit.MILLISECONDS)));
}
@Test
public void bar() throws ClassNotFoundException, SQLException {
Class.forName("com.microsoft.sqlserver.jdbc.SQLServerDriver");
String connectionUrl = "jdbc:sqlserver://hostname:1433;databaseName=local_test;user=sa;password=blablub;";
Connection con = DriverManager.getConnection(connectionUrl);
con.setAutoCommit(false);
Stopwatch sw = Stopwatch.createStarted();
PreparedStatement insertStatement = con.prepareStatement("INSERT INTO DatabaseObject(b_id, insert_time) VALUES (?, ?)");
for (int i = 0; i < NUM_LOOPS; i++) {
insertStatement.setString(1, UUID.randomUUID().toString());
insertStatement.setDate(2, Date.valueOf(LocalDate.now()));
insertStatement.addBatch();
}
insertStatement.executeBatch();
con.commit();
con.close();
sw.stop();
System.out.println(String.format("Duration plain JDBC: %,dms", sw.elapsed(TimeUnit.MILLISECONDS)));
}
}
PersistableObjectClass.java:
import javax.persistence.Basic;
...
import javax.persistence.TemporalType;
@Entity(name = "PersistableObjectClass")
@MappedSuperclass()
@Inheritance(strategy = InheritanceType.TABLE_PER_CLASS)
public abstract class PersistableObjectClass {
@Basic()
@Temporal(TemporalType.TIMESTAMP)
private Date insert_time = null;
@Id()
@GeneratedValue(strategy = GenerationType.IDENTITY)
private int s_id = 0;
...
}
Upvotes: 2
Views: 1182
Reputation: 153780
Not only that you need to use batch updates, but you also need to ensure that the transaction is committed at regular intervals, as otherwise, you'd run into a long-running transaction, which is bad on both 2PL or MVCC database engines.
So, this is how you batch job should look like:
int entityCount = 50;
int batchSize = 25;
EntityManager entityManager = null;
EntityTransaction transaction = null;
try {
entityManager = entityManagerFactory()
.createEntityManager();
transaction = entityManager.getTransaction();
transaction.begin();
for ( int i = 0; i < entityCount; ++i ) {
if ( i > 0 && i % batchSize == 0 ) {
entityManager.flush();
entityManager.clear();
transaction.commit();
transaction.begin();
}
Post post = new Post(
String.format( "Post %d", i + 1 )
);
entityManager.persist( post );
}
transaction.commit();
} catch (RuntimeException e) {
if ( transaction != null &&
transaction.isActive()) {
transaction.rollback();
}
throw e;
} finally {
if (entityManager != null) {
entityManager.close();
}
}
Upvotes: 3