We've set up a Spring Batch email-sender job that reads email messages off a JMS queue and writes them out through a JavaMail-based item writer (the full configuration is below).
This generally works fine, but we've discovered a massive memory leak. After 9 hours or so, we can see a single FlowJob, but 748 JobExecutions (all 'STARTED'), each holding 778 (!) StepExecution instances. All in all, 900 MB of stuff.
Here's the Spring config (Spring 3.1, Spring Batch 2.1.9):
<bean id="jobRepository"
      class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean" />

<bean id="jobLauncher"
      class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    <property name="jobRepository" ref="jobRepository" />
    <property name="taskExecutor">
        <bean class="org.springframework.core.task.SimpleAsyncTaskExecutor" />
    </property>
</bean>

<bean id="jmsEmailFetcher" class="org.springframework.batch.item.jms.JmsItemReader">
    <property name="jmsTemplate" ref="batchEmailJmsTemplate" />
</bean>

<bean id="passthroughProcessor" class="org.springframework.batch.item.support.PassThroughItemProcessor" />

<bean id="transactionManager"
      class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />

<!-- The Spring Batch *Limiter/Decider* -->
<bean id="ourLimitDecider" class="our.special.InfiniteThrottledExecutionDecider" />

<!-- The Spring Batch *Job* -->
<batch:job id="fetchEmailsJobBatch" restartable="true">
    <batch:step id="fetchEmailsStep" next="limitDecision">
        <batch:tasklet throttle-limit="10">
            <batch:chunk reader="jmsEmailFetcher" processor="passthroughProcessor"
                         writer="batchEmailService.javaMailItemWriter" commit-interval="100"
                         skip-limit="999999999"> <!-- Might get *lots* of MQ-not-up-yet fails -->
                <batch:skippable-exception-classes>
                    <batch:include class="org.springframework.jms.JmsException" />        <!-- Generally MQ-not-up-yet ConnectionException, or Session-closed (in tests) -->
                    <batch:include class="java.lang.IllegalStateException" />            <!-- Yuk, usually/presumably a test SMTP server isn't yet connected -->
                    <batch:include class="org.springframework.mail.MailSendException" /> <!-- SMTP down... -->
                </batch:skippable-exception-classes>
            </batch:chunk>
        </batch:tasklet>
    </batch:step>
    <batch:decision id="limitDecision" decider="ourLimitDecider">
        <batch:next on="CONTINUE" to="fetchEmailsStep" />
        <batch:end on="COMPLETED" />
    </batch:decision>
</batch:job>
Our InfiniteThrottledExecutionDecider basically returns new FlowExecutionStatus("CONTINUE") every time, to ensure that the fetchEmailsStep gets executed at the end of the flow, and that the Job never completes - at least not until we're ready to stop it ourselves.
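A minimal sketch of what such a decider looks like (simplified - the real class also handles the throttling bookkeeping):

```java
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.job.flow.FlowExecutionStatus;
import org.springframework.batch.core.job.flow.JobExecutionDecider;

// Sketch of a decider in the spirit of InfiniteThrottledExecutionDecider:
// it always reports the custom "CONTINUE" status, so the <batch:decision>
// routes back to fetchEmailsStep and the flow never reaches <batch:end>.
public class AlwaysContinueDecider implements JobExecutionDecider {

    public FlowExecutionStatus decide(JobExecution jobExecution, StepExecution stepExecution) {
        return new FlowExecutionStatus("CONTINUE");
    }
}
```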
We're not using a database, so we expect some stuff to be held in memory, but not a complete record of everything that's ever been run...
Is there something wrong with our configuration? Or our approach?
Here's a bit from our log file for what we're told is Step #778 and what should be the one and only Job instance.
23:58:18,782 - INFO (org.springframework.batch.core.job.SimpleStepHandler) - Duplicate step [fetchEmailsStep] detected in execution of job=[fetchEmailsJobBatch]. If either step fails, both will be executed again on restart.
23:59:52,257 - INFO (org.springframework.batch.core.job.SimpleStepHandler) - Executing step: [fetchEmailsStep]
23:59:52,257 - DEBUG (org.springframework.batch.core.step.AbstractStep) - Executing: id=778
23:59:52,259 - DEBUG (org.springframework.batch.repeat.support.RepeatTemplate) - Starting repeat context.
23:59:52,259 - DEBUG (org.springframework.batch.repeat.support.RepeatTemplate) - Repeat operation about to start at count=1
23:59:52,259 - DEBUG (org.springframework.batch.core.scope.context.StepContextRepeatCallback) - Preparing chunk execution for StepContext: org.springframework.batch.core.scope.context.StepContext@1be1ee
23:59:52,259 - DEBUG (org.springframework.batch.core.scope.context.StepContextRepeatCallback) - Chunk execution starting: queue size=0
23:59:52,259 - DEBUG (org.springframework.batch.repeat.support.RepeatTemplate) - Starting repeat context.
23:59:52,259 - DEBUG (org.springframework.batch.repeat.support.RepeatTemplate) - Repeat operation about to start at count=1
... 5 second JMS timeout ...
23:59:57,716 - DEBUG (org.springframework.batch.repeat.support.RepeatTemplate) - Repeat is complete according to policy and result value.
23:59:57,716 - DEBUG (org.springframework.batch.core.step.item.ChunkOrientedTasklet) - Inputs not busy, ended: true
23:59:57,716 - DEBUG (org.springframework.batch.core.step.tasklet.TaskletStep) - Applying contribution: [StepContribution: read=0, written=0, filtered=0, readSkips=0, writeSkips=0, processSkips=0, exitStatus=EXECUTING]
23:59:57,719 - DEBUG (org.springframework.batch.core.step.tasklet.TaskletStep) - Saving step execution before commit: StepExecution: id=778, version=1, name=fetchEmailsStep, status=STARTED, exitStatus=EXECUTING, readCount=0, filterCount=0, writeCount=0 readSkipCount=0, writeSkipCount=0, processSkipCount=0, commitCount=1, rollbackCount=0, exitDescription=
23:59:57,721 - DEBUG (org.springframework.batch.repeat.support.RepeatTemplate) - Repeat is complete according to policy and result value.
23:59:57,721 - DEBUG (org.springframework.batch.core.step.AbstractStep) - Step execution success: id=778
23:59:57,722 - DEBUG (org.springframework.batch.core.step.AbstractStep) - Step execution complete: StepExecution: id=778, version=3, name=fetchEmailsStep, status=COMPLETED, exitStatus=COMPLETED, readCount=0, filterCount=0, writeCount=0 readSkipCount=0, writeSkipCount=0, processSkipCount=0, commitCount=1, rollbackCount=0
23:59:57,723 - DEBUG (org.springframework.batch.core.job.flow.support.SimpleFlow) - Completed state=fetchEmailsJobBatch.fetchEmailsStep with status=COMPLETED
23:59:57,723 - DEBUG (org.springframework.batch.core.job.flow.support.SimpleFlow) - Handling state=fetchEmailsJobBatch.limitDecision100
23:59:57,723 - DEBUG (org.springframework.batch.core.job.flow.support.SimpleFlow) - Completed state=fetchEmailsJobBatch.limitDecision100 with status=CONTINUE
23:59:57,723 - DEBUG (org.springframework.batch.core.job.flow.support.SimpleFlow) - Handling state=fetchEmailsJobBatch.fetchEmailsStep
Problem is, the heap dump shows 748 JobExecutions, and 581,911 StepExecutions. Where are they all coming from?
OK, in answer to my question, the solution was to drop the Step Flow and make "fetchEmailsStep" run forever.
We achieved that by removing the "readTimeout" (i.e. the receiveTimeout property) from our JmsTemplate ("batchEmailJmsTemplate"), and subclassing org.springframework.batch.item.jms.JmsItemReader to remove the check it does to stop you using an indefinitely-blocking JmsTemplate - and to add extra logging. If you don't want the override, you could set the "readTimeout" to a huge number.
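A sketch of what that subclass might look like (hypothetical - it assumes, as in Spring Batch 2.1, that JmsItemReader's jmsTemplate field is protected and that the guard lives in setJmsTemplate()):

```java
import org.springframework.batch.item.jms.JmsItemReader;
import org.springframework.jms.core.JmsOperations;

// Hypothetical subclass: bypasses the guard in setJmsTemplate() that
// rejects a JmsTemplate configured to block indefinitely on receive().
public class BlockingJmsItemReader<T> extends JmsItemReader<T> {

    @Override
    public void setJmsTemplate(JmsOperations jmsTemplate) {
        // Assign the (protected) field directly instead of calling
        // super.setJmsTemplate(), which would reject an infinite timeout.
        this.jmsTemplate = jmsTemplate;
    }
}
```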
More discussion here: http://forum.springsource.org/showthread.php?132468-Massive-memory-use-with-FlowJob-repeating-Steps&p=431279
I still don't know why the memory usage was quite as huge as it was...
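Part of the explanation, at least, is that the map-backed repository never forgets: MapJobRepositoryFactoryBean keeps every JobExecution and StepExecution in in-memory maps for the life of the JVM, and nothing in the original configuration ever cleared them. A hedged sketch of wiping that history (assuming Spring Batch 2.1, where clear() is a static method - later versions made it an instance method):

```java
import org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean;

public class JobHistoryCleaner {

    // The map-backed repository stores all executions in in-memory maps;
    // clear() wipes them. Only safe if dropping all run history is acceptable.
    public static void wipeInMemoryJobHistory() {
        MapJobRepositoryFactoryBean.clear();
    }
}
```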
[Edit] I was also using a JobExecutionDecider ("our.special.InfiniteThrottledExecutionDecider") to achieve throttling (based on the <StepExecution>.writeCount), but this had to go. We're now achieving the same result using a new class that extends StepExecutionListenerSupport (or you can implement StepExecutionListener). An instance of this bean is added as a <batch:listener> to our single <batch:step>.
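One possible shape for that listener (a hypothetical sketch - the threshold, the back-off, and the choice to act in afterStep() are all assumptions, not our real code):

```java
import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.listener.StepExecutionListenerSupport;

// Hypothetical throttling listener: inspects the step's writeCount and
// backs off when a (made-up) threshold is exceeded.
public class ThrottlingStepListener extends StepExecutionListenerSupport {

    private final int maxWrites;     // assumed threshold
    private final long pauseMillis;  // assumed back-off

    public ThrottlingStepListener(int maxWrites, long pauseMillis) {
        this.maxWrites = maxWrites;
        this.pauseMillis = pauseMillis;
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        if (stepExecution.getWriteCount() >= maxWrites) {
            try {
                Thread.sleep(pauseMillis); // crude back-off
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return stepExecution.getExitStatus();
    }
}
```

It would be wired in with something like <batch:listener ref="throttlingStepListener" /> inside the <batch:step> (bean name assumed).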