Reputation: 137
UPDATE:
I'm adding some details, because it's very important for me to solve this problem.
I wrote a batch job that generates PDF documents from data in some tables and saves the PDFs in a table. The batch works, but the data to process is huge, so I decided to divide the input data into 8 groups and process the 8 groups independently with 8 parallel steps. Each step has its own reader (named "readerX" for step "X") and shares the same processor and writer with the other steps.
Processing goes well, but my client says this batch uses too much memory (he is looking at the "Working Set" counter in perfmon). In particular, the batch begins with 300 MB of used memory, then the used memory reaches 7 GB, then decreases to 2 GB, and the batch finishes with 1/2 GB of allocated memory.
I'm pasting the job configuration here, hoping someone can help me find the problem (I guess I made some mistake in adapting the job to parallel processing).
I'm new to Spring Batch, so I apologize for the "bad look".
<job id="myJob"
xmlns="http://www.springframework.org/schema/batch">
<step id="step1" next="step2">
<tasklet ref="task1" />
</step>
<step id="step2" next="step3">
<tasklet ref="task2" />
</step>
<step id="step3" next="decider">
<tasklet ref="task3" />
</step>
<decision id="decider" decider="StepExecutionDecider">
<next on="CASE X" to="split1" />
<end on="*"/>
</decision>
<split id="split1" task-executor="taskExecutor" next="endStep">
<flow>
<step id="EXEC1">
<tasklet><chunk reader="reader1" processor="processor" writer="writer" commit-interval="100"/>
<listeners>
<listener ref="Listner" />
</listeners>
</tasklet>
</step>
</flow>
<flow>
<step id="EXEC2">
<tasklet><chunk reader="reader2" processor="processor" writer="writer" commit-interval="100"/>
<listeners>
<listener ref="Listner" />
</listeners>
</tasklet>
</step>
</flow>
<flow>
<step id="EXEC3">
<tasklet><chunk reader="reader3" processor="processor" writer="writer" commit-interval="100"/>
<listeners>
<listener ref="Listner" />
</listeners>
</tasklet>
</step>
</flow>
<flow>
<step id="EXEC4">
<tasklet><chunk reader="reader4" processor="processor" writer="writer" commit-interval="100"/>
<listeners>
<listener ref="Listner" />
</listeners>
</tasklet>
</step>
</flow>
<flow>
<step id="EXEC5">
<tasklet><chunk reader="reader5" processor="processor" writer="writer" commit-interval="100"/>
<listeners>
<listener ref="Listner" />
</listeners>
</tasklet>
</step>
</flow>
<flow>
<step id="EXEC6">
<tasklet><chunk reader="reader6" processor="processor" writer="writer" commit-interval="100"/>
<listeners>
<listener ref="Listner" />
</listeners>
</tasklet>
</step>
</flow>
<flow>
<step id="EXEC7">
<tasklet><chunk reader="reader7" processor="processor" writer="writer" commit-interval="100"/>
<listeners>
<listener ref="Listner" />
</listeners>
</tasklet>
</step>
</flow>
<flow>
<step id="EXEC8">
<tasklet><chunk reader="reader8" processor="processor" writer="writer" commit-interval="100"/>
<listeners>
<listener ref="Listner" />
</listeners>
</tasklet>
</step>
</flow>
</split>
<step id="endStep" next="decider">
<tasklet ref="task4" >
<listeners>
<listener ref="Listner" />
</listeners>
</tasklet>
</step>
</job>
<bean id="taskExecutor" class="org.springframework.core.task.SimpleAsyncTaskExecutor"/>
<bean id="reader1" class="class of the reader">
<property name="idReader" value="1"/> // Different for the 8 readers
<property name="subSet" value="10"/> // Different for the 8 readers
<property name="dao" ref="Dao" />
<property name="bean" ref="Bean" />
[...] // Other beans
</bean>
Thanks
Upvotes: 1
Views: 4690
Reputation: 12059
The batch works, but the data to process is huge, so I decided to divide the input data into 8 groups and process the 8 groups independently with 8 parallel steps.
If you are processing in parallel on the same machine, it won't reduce the memory footprint: all the data exists in memory at the same time. If you want to decrease memory use, you have to execute the steps one after the other.
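To illustrate against the configuration above (a sketch I haven't run against your job; the pool size is just an example): SimpleAsyncTaskExecutor starts a new thread per flow, so all 8 flows run at once. Swapping it for a synchronous or bounded executor serializes or caps the work. Use one of the two beans below in place of the current taskExecutor:

<!-- Option 1: run the 8 flows one after the other, in the calling thread -->
<bean id="taskExecutor" class="org.springframework.core.task.SyncTaskExecutor" />

<!-- Option 2: cap how many flows run at once (a pool size of 2 is illustrative) -->
<bean id="taskExecutor" class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
    <property name="corePoolSize" value="2" />
    <property name="maxPoolSize" value="2" />
</bean>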
Upvotes: 0
Reputation: 137
Using a profiler and optimizing the code, I successfully limited memory consumption. Thanks to all!
Upvotes: 0
Reputation:
If you're eventually getting an OOM, start by looking at the heap.
Start the JVM with -XX:+HeapDumpOnOutOfMemoryError to obtain an HPROF file, which you can then inspect to see object allocations, sizes, etc. When the JVM exits with an OOM, this file is generated (it may take some time, depending on the heap size).
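For example (the heap size, dump path, and jar name below are placeholders for your own launch command):

java -Xmx2g -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/batch.hprof -jar my-batch.jar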
If you're able to run with a larger memory footprint, such as on your client's machine, take a snapshot of the heap when it's consuming a large amount, such as the 7 GB you mentioned (or any other value considered high: 4, 5, 6 GB, etc.). You should be able to trigger this while the job is running via tools such as jconsole, which comes with the JDK.
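From the command line, jmap (also shipped with the JDK) can take the same snapshot of a running JVM; the PID below is whatever jps reports for the batch process:

jmap -dump:live,format=b,file=/tmp/batch-running.hprof <pid>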
You can then inspect the HPROF file with JDK-provided tools such as jhat, or a more GUI-based tool such as the Eclipse Memory Analyzer. This should give you a good (and relatively easy) way of finding out what's holding on to what, and a starting point for decreasing the footprint.
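For example, jhat loads the dump and serves a browsable view on port 7000 by default; give jhat's own JVM a generous heap via -J, since it has to hold the whole dump (the path is the placeholder used above):

jhat -J-Xmx4g /tmp/batch.hprof

Then point a browser at http://localhost:7000/.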
Upvotes: 2