Reputation: 1877
I am reading 5gb of XML file using the below code and processing that data into Database using spring dataJpa, this below is just sample logic we are closing the inpustream and aswell xsr object.
XMLInputFactory xf=XMLInputFactory.newInstance();
XMLStreamReader xsr=xf.createXMLStreamReader(new InputStreamReader(new FileInputStream("test.xml"))
i have configured the max 8GB(ie -xms7000m and -xmx8000m) of heap memory But its getting the below hibernate heap issue when saving the data. it inserted around 700000 data total of 2100000
[dispatcherServlet] : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Handler dispatch failed; nested exception is java.lang.OutOfMemoryError: Java heap space] with root cause
java.lang.OutOfMemoryError: Java heap space
at java.base/java.util.IdentityHashMap.resize(IdentityHashMap.java:472) ~[na:na]
at java.base/java.util.IdentityHashMap.put(IdentityHashMap.java:441) ~[na:na]
at org.hibernate.event.internal.DefaultPersistEventListener.entityIsPersistent(DefaultPersistEventListener.java:159) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.event.internal.DefaultPersistEventListener.onPersist(DefaultPersistEventListener.java:124) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.internal.SessionImpl$$Lambda$1620/0x00000008010a3040.applyEventToListener(Unknown Source) ~[na:na]
at org.hibernate.event.service.internal.EventListenerGroupImpl.fireEventOnEachListener(EventListenerGroupImpl.java:113) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.internal.SessionImpl.persistOnFlush(SessionImpl.java:765) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.spi.CascadingActions$8.cascade(CascadingActions.java:341) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.internal.Cascade.cascadeToOne(Cascade.java:492) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.internal.Cascade.cascadeAssociation(Cascade.java:416) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.internal.Cascade.cascadeProperty(Cascade.java:218) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.internal.Cascade.cascadeCollectionElements(Cascade.java:525) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.internal.Cascade.cascadeCollection(Cascade.java:456) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.internal.Cascade.cascadeAssociation(Cascade.java:419) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.internal.Cascade.cascadeProperty(Cascade.java:218) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.internal.Cascade.cascade(Cascade.java:151) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.event.internal.AbstractFlushingEventListener.cascadeOnFlush(AbstractFlushingEventListener.java:158) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.event.internal.AbstractFlushingEventListener.prepareEntityFlushes(AbstractFlushingEventListener.java:148) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.event.internal.AbstractFlushingEventListener.flushEverythingToExecutions(AbstractFlushingEventListener.java:81) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.event.internal.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:39) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.internal.SessionImpl$$Lambda$1597/0x0000000801076040.accept(Unknown Source) ~[na:na]
at org.hibernate.event.service.internal.EventListenerGroupImpl.fireEventOnEachListener(EventListenerGroupImpl.java:102) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.internal.SessionImpl.doFlush(SessionImpl.java:1362) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.internal.SessionImpl.managedFlush(SessionImpl.java:453) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.internal.SessionImpl.flushBeforeTransactionCompletion(SessionImpl.java:3212) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.internal.SessionImpl.beforeTransactionCompletion(SessionImpl.java:2380) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.jdbc.internal.JdbcCoordinatorImpl.beforeTransactionCompletion(JdbcCoordinatorImpl.java:447) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl.beforeCompletionCallback(JdbcResourceLocalTransactionCoordinatorImpl.java:183) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl.access$300(JdbcResourceLocalTransactionCoordinatorImpl.java:40) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl$TransactionDriverControlImpl.commit(JdbcResourceLocalTransactionCoordinatorImpl.java:281) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.transaction.internal.TransactionImpl.commit(TransactionImpl.java:101) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.springframework.orm.jpa.JpaTransactionManager.doCommit(JpaTransactionManager.java:534) ~[spring-orm-5.2.10.RELEASE.jar:5.2.10.RELEASE]
As per the above trace logs it seems some issue with hibernate save casacade , but not able to figured out , below are the entity classes which uses to save the data in databse.
@Data
@EqualsAndHashCode
@Builder(toBuilder = true)
@NoArgsConstructor(access = AccessLevel.PRIVATE)
@AllArgsConstructor(access = AccessLevel.PRIVATE)
public class UMEnityPK implements Serializable {
private static final long serialVersionUID=1L;
private String batchId;
private Long batchVersion;
private BigInteger umId;
}
@Data
@Builder(toBuilder = true)
@NoArgsConstructor(access = AccessLevel.PRIVATE)
@AllArgsConstructor(access = AccessLevel.PRIVATE)
@EqualsAndHashCode(of = {"batchId", "batchVersion", "umId"})
@Entity
@Table(name ="um_base")
@IdClass(UMEnityPK.class)
public class UMBase {
@Id private String batchId;
@Id private Long batchVersion;
@Id private BigInteger umId;
private String firstName;
private String lastName;
private String umType;
private String umLevel;
@OneToMany(mappedBy = "umBase", cascade = CascadeType.ALL)
private List<UMAddress> umAddresses;
@OneToMany(mappedBy = "umBase", cascade = CascadeType.ALL)
private List<UMIdentifier> umIdentifiers;
@OneToOne(mappedBy = "umBase", cascade = CascadeType.ALL)
private UMHierarchy umHierarchy;
}
@Data
@Builder(toBuilder = true)
@NoArgsConstructor(access = AccessLevel.PRIVATE)
@AllArgsConstructor(access = AccessLevel.PRIVATE)
@EqualsAndHashCode(of = {"id"})
@Entity
@Table(name = "um_identifier")
public class UMIdentifier {
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "um_address")
@SequenceGenerator(name = "um_address", sequenceName = "SEQ_UM_ADDRESS", allocationSize = 1)
private Long id;
private String idValue;
private String idType;
private String groupType;
@ManyToOne
@JoinColumns({
@JoinColumn(name = "BATCH_ID", referencedColumnName = "batchId"),
@JoinColumn(name = "BATCH_VERSION", referencedColumnName = "batchVersion"),
@JoinColumn(name = "UM_ID", referencedColumnName = "umId")
})
private UMBase umBase;
}
@Data
@Builder(toBuilder = true)
@NoArgsConstructor(access = AccessLevel.PRIVATE)
@AllArgsConstructor(access = AccessLevel.PRIVATE)
@EqualsAndHashCode(of = {"batchId", "batchVersion", "umId"})
@Entity
@Table(name ="um_hierarchy")
public class UMHierarchy {
@Id
private String batchId;
@Id private Long batchVersion;
@Id private BigInteger umId;
private String hierarchyTpe;
private String umStatusCode;
private String immediateParentId;
private Date hierarchyDate;
@OneToOne(cascade = CascadeType.ALL)
@JoinColumns({
@JoinColumn(name = "BATCH_ID", referencedColumnName = "batchId"),
@JoinColumn(name = "BATCH_VERSION", referencedColumnName = "batchVersion"),
@JoinColumn(name = "UM_ID", referencedColumnName = "umId")
})
private UMBase umBase;
}
@Data
@Builder(toBuilder = true)
@NoArgsConstructor(access = AccessLevel.PRIVATE)
@AllArgsConstructor(access = AccessLevel.PRIVATE)
@EqualsAndHashCode(of = {"id"})
@Entity
@Table(name = "um_address")
public class UMAddress {
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "um_address")
@SequenceGenerator(name = "um_address", sequenceName = "SEQ_UM_ADDRESS", allocationSize = 1)
private Long id;
private String addressType;
private String addressLine1;
private String getAddressLine2;
private String city;
private String state;
private String postalCode;
private String country;
@ManyToOne
@JoinColumns({
@JoinColumn(name = "BATCH_ID", referencedColumnName = "batchId"),
@JoinColumn(name = "BATCH_VERSION", referencedColumnName = "batchVersion"),
@JoinColumn(name = "UM_ID", referencedColumnName = "umId")
})
private UMBase umBase;
}
Is there any issues with hibernate entity mapping which eating of the memory
Upvotes: 2
Views: 7774
Reputation: 1877
After checking the heap dump, the issue is with the org.hibernate.engine.StatefulPersistenceContext -> org.hibernate.util.IdentityMap
memory leak , so used the below way and worked fine , create a custom JPARepository and have below sample method logic.
public <S extends T> void saveInBatch(Iterable<S> entities) {
if (entities == null) {
return;
}
EntityManager entityManager = entityManagerFactory.createEntityManager();
EntityTransaction entityTransaction = entityManager.getTransaction();
try {
entityTransaction.begin();
int i = 0;
for (S entity : entities) {
if (i % batchSize == 0 && i > 0) {
entityTransaction.commit();
entityTransaction.begin();
entityManager.clear();
}
entityManager.persist(entity);
i++;
}
entityTransaction.commit();
} catch (RuntimeException e) {
if (entityTransaction.isActive()) {
entityTransaction.rollback();
}
throw e;
} finally {
entityManager.close();
}
}
}
Upvotes: 1
Reputation: 5095
When dealing with converting datasets that big you need to do that in batches. Read 100 records from the xml, convert them to entities, save each of them with em.persist(record)
, then call em.flush()
and em.clear()
to remove them from Hibernate, then clear them from your local collection, then manually invoke the garbage collector using System.gc()
. You may even want to use Hibernate's batch processing as described in this tutorial.
In pseudocode this would be:
boolean finished = false;
List<Entity> locals = new ArrayList<>(100);
while (!finished) {
for (int records = 0; records < 100; records++) {
Entity ent = readEntityFrom(xml);
// readEntity function must return null when no more remain to read
if (ent == null) {
finished = true;
break;
}
locals.add(ent);
}
for (Entity ent : locals) em.persist(ent);
em.flush(); // send any that are still waiting to the database
em.clear(); // remove references Hibernate holds to these entities
locals.clear(); // remove references we hold to these entities
// now all these entity references are weak and can be garbage-collected
System.gc(); // purge them from memory
}
Also you may want to manually begin and commit a transaction around each insert loop to ensure the database isn't holding everything for your entire import, or it might run out of memory instead of your java application.
Upvotes: 1