Reputation: 352
While trying to test the solution's solvers using a benchmark configuration, I encounter the follow exception :
2021-12-22 15:24:37.328 WARN 22684 --- [ Test worker] c.o.b.i.D.singleBenchmarkRunnerException : The warm up singleBenchmarkRunner (Problem_0_Currently used_0) with random seed (null) failed.
java.lang.OutOfMemoryError: Java heap space
2021-12-22 15:24:37.329 WARN 22684 --- [ Test worker] c.o.b.i.D.singleBenchmarkRunnerException : The warm up singleBenchmarkRunner (Problem_0_Currently used_0) with random seed (null) failed.
java.lang.OutOfMemoryError: Java heap space
2021-12-22 15:24:37.330 WARN 22684 --- [ Test worker] c.o.b.i.D.singleBenchmarkRunnerException : The warm up singleBenchmarkRunner (Problem_0_Currently used_0) with random seed (null) failed.
java.lang.OutOfMemoryError: Java heap space
at java.base/java.lang.Long.valueOf(Long.java:1207)
at myrostering.solver.PEC.LambdaExtractorEC9F24820AB70C5865CE63ED29F967E9.apply(LambdaExtractorEC9F24820AB70C5865CE63ED29F967E9.java:69)
at myrostering.solver.PEC.LambdaExtractorEC9F24820AB70C5865CE63ED29F967E9.apply(LambdaExtractorEC9F24820AB70C5865CE63ED29F967E9.java:1)
at org.drools.model.functions.Function1$Impl.apply(Function1.java:35)
at org.drools.modelcompiler.constraints.LambdaReadAccessor.getValue(LambdaReadAccessor.java:42)
at org.drools.core.rule.Declaration.getValue(Declaration.java:258)
at org.drools.core.rule.Declaration.getValue(Declaration.java:253)
at org.drools.modelcompiler.constraints.BindingEvaluator.getArgument(BindingEvaluator.java:59)
at org.drools.modelcompiler.constraints.ConstraintEvaluator$InnerEvaluator.getArgument(ConstraintEvaluator.java:242)
at org.drools.modelcompiler.constraints.ConstraintEvaluator$InnerEvaluator$_2.evaluate(ConstraintEvaluator.java:309)
at org.drools.modelcompiler.constraints.ConstraintEvaluator.evaluate(ConstraintEvaluator.java:124)
at org.drools.modelcompiler.constraints.LambdaConstraint.isAllowedCachedLeft(LambdaConstraint.java:187)
at org.drools.core.common.SingleBetaConstraints.isAllowedCachedLeft(SingleBetaConstraints.java:132)
at org.drools.core.phreak.PhreakAccumulateNode.doLeftInserts(PhreakAccumulateNode.java:178)
at org.drools.core.phreak.PhreakAccumulateNode.doNode(PhreakAccumulateNode.java:89)
at org.drools.core.phreak.RuleNetworkEvaluator.switchOnDoBetaNode(RuleNetworkEvaluator.java:591)
at org.drools.core.phreak.RuleNetworkEvaluator.evalBetaNode(RuleNetworkEvaluator.java:558)
at org.drools.core.phreak.RuleNetworkEvaluator.evalNode(RuleNetworkEvaluator.java:385)
at org.drools.core.phreak.RuleNetworkEvaluator.innerEval(RuleNetworkEvaluator.java:345)
at org.drools.core.phreak.RuleNetworkEvaluator.outerEval(RuleNetworkEvaluator.java:181)
at org.drools.core.phreak.RuleNetworkEvaluator.evaluateNetwork(RuleNetworkEvaluator.java:139)
at org.drools.core.phreak.RuleExecutor.reEvaluateNetwork(RuleExecutor.java:235)
at org.drools.core.phreak.RuleExecutor.evaluateNetworkAndFire(RuleExecutor.java:91)
at org.drools.core.concurrent.AbstractRuleEvaluator.internalEvaluateAndFire(AbstractRuleEvaluator.java:33)
at org.drools.core.concurrent.SequentialRuleEvaluator.evaluateAndFire(SequentialRuleEvaluator.java:43)
at org.drools.core.common.DefaultAgenda.fireLoop(DefaultAgenda.java:753)
at org.drools.core.common.DefaultAgenda.internalFireAllRules(DefaultAgenda.java:700)
at org.drools.core.common.DefaultAgenda.fireAllRules(DefaultAgenda.java:692)
at org.drools.core.impl.StatefulKnowledgeSessionImpl.internalFireAllRules(StatefulKnowledgeSessionImpl.java:1225)
at org.drools.core.impl.StatefulKnowledgeSessionImpl.fireAllRules(StatefulKnowledgeSessionImpl.java:1216)
at org.drools.core.impl.StatefulKnowledgeSessionImpl.fireAllRules(StatefulKnowledgeSessionImpl.java:1200)
at org.optaplanner.core.impl.score.director.drools.DroolsScoreDirector.calculateScore(DroolsScoreDirector.java:105)
Here is the test class I ran:
@SpringBootTest(classes = MyApplication.class)
@EnableConfigurationProperties({ApplicationProperties.class, MyRosterProperties.class})
public class SolverBenchmarkTest {
private PlannerBenchmarkFactory benchmarkFactory = PlannerBenchmarkFactory.createFromXmlResource(
"myrostering/benchmark/benchmarkSolverConfig.xml");
@Autowired
MyRosterGenerator myRosterGenerator;
@Test
public void benchmarkBasicRostering() {
MyRoster mr = myRosterGenerator.createMyRoster();
PlannerBenchmark benchmark = benchmarkFactory.buildPlannerBenchmark(mr);
benchmark.benchmarkAndShowReportInBrowser();
}
}
Here is the benchmark configuration file :
<?xml version="1.0" encoding="UTF-8"?>
<plannerBenchmark xmlns="https://www.optaplanner.org/xsd/benchmark" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="https://www.optaplanner.org/xsd/benchmark https://www.optaplanner.org/xsd/benchmark/benchmark.xsd">
<benchmarkDirectory>local/benchmark/data/my-roster</benchmarkDirectory>
<parallelBenchmarkCount>AUTO</parallelBenchmarkCount>
<warmUpSecondsSpentLimit>30</warmUpSecondsSpentLimit>
<inheritedSolverBenchmark>
<solver>
<!-- This part of the solver configuration must be the same as the one used by the planner, otherwise, the benchmark test is pointless -->
<moveThreadCount>4</moveThreadCount>
<solutionClass>myrostering.domain.MyRoster</solutionClass>
<entityClass>myrostering.domain.Assignment</entityClass>
<scoreDirectorFactory>
<scoreDrl>myrostering/solver/myRosteringScoreRules.drl</scoreDrl>
</scoreDirectorFactory>
<termination>
<!-- Adding this secondsSpentLimit (contrary to no limit set for the planner) to avoid the benchmark running for too long -->
<secondsSpentLimit>60</secondsSpentLimit>
<bestScoreLimit>0hard/0medium/0soft</bestScoreLimit>
</termination>
<constructionHeuristic>
<constructionHeuristicType>STRONGEST_FIT</constructionHeuristicType>
</constructionHeuristic>
</solver>
</inheritedSolverBenchmark>
<solverBenchmark>
<name>Currently used</name>
<solver>
<localSearch>
<unionMoveSelector>
<moveListFactory>
<cacheType>PHASE</cacheType>
<moveListFactoryClass>
myrostering.solver.move.factory.ChangeMoveFactory
</moveListFactoryClass>
</moveListFactory>
<moveListFactory>
<cacheType>PHASE</cacheType>
<moveListFactoryClass>
myrostering.solver.move.factory.SwapMoveFactory
</moveListFactoryClass>
</moveListFactory>
</unionMoveSelector>
<acceptor>
<entityTabuSize>5</entityTabuSize>
<simulatedAnnealingStartingTemperature>15000hard/10medium/1000soft</simulatedAnnealingStartingTemperature>
</acceptor>
<forager>
<acceptedCountLimit>4</acceptedCountLimit>
</forager>
</localSearch>
</solver>
</solverBenchmark>
</plannerBenchmark>
Also, I'd like to add that we run the solver.solve() without an issue - even if the dataset is quite large (150 to 300 Mo for the file containing the solution when serialized). So I'm a bit surprised when the benchmark fails on warm up...
EDIT:
I've changed the configuration for these two parameters :
<parallelBenchmarkCount>1</parallelBenchmarkCount>
...
<secondsSpentLimit>600</secondsSpentLimit>
But I still got the following exception :
2022-01-03 10:53:49.850 INFO 21696 --- [nchmarkThread-1] o.d.c.kie.builder.impl.KieContainerImpl : Start creation of KieBase: defaultKieBase
2022-01-03 10:53:49.909 INFO 21696 --- [nchmarkThread-1] o.d.c.kie.builder.impl.KieContainerImpl : End creation of KieBase: defaultKieBase
2022-01-03 10:54:32.585 INFO 21696 --- [nchmarkThread-1] o.o.core.impl.solver.DefaultSolver : Solving started: time spent (41506), best score (-38295462hard/38260medium/3640soft), environment mode (REPRODUCIBLE), move thread count (4), random (JDK with seed 0).
2022-01-03 10:54:33.611 ERROR 21696 --- [nchmarkThread-1] o.o.core.impl.solver.thread.ThreadUtils : Multithreaded Local Search's ExecutorService didn't terminate within timeout (1 seconds).
2022-01-03 10:54:33.611 INFO 21696 --- [nchmarkThread-1] o.o.c.i.h.thread.MoveThreadRunner : Score calculation speed will be too low because move thread (0)'s destroy wasn't processed soon enough.
2022-01-03 10:54:33.611 INFO 21696 --- [nchmarkThread-1] o.o.c.i.h.thread.MoveThreadRunner : Score calculation speed will be too low because move thread (1)'s destroy wasn't processed soon enough.
2022-01-03 10:54:33.611 INFO 21696 --- [nchmarkThread-1] o.o.c.i.h.thread.MoveThreadRunner : Score calculation speed will be too low because move thread (2)'s destroy wasn't processed soon enough.
2022-01-03 10:54:33.611 INFO 21696 --- [nchmarkThread-1] o.o.c.i.h.thread.MoveThreadRunner : Score calculation speed will be too low because move thread (3)'s destroy wasn't processed soon enough.
2022-01-03 10:54:33.612 INFO 21696 --- [nchmarkThread-1] .c.i.c.DefaultConstructionHeuristicPhase : Construction Heuristic phase (0) ended: time spent (42533), best score (-38295462hard/38260medium/3640soft), score calculation speed (0/sec), step total (0).
2022-01-03 10:56:00.115 WARN 21696 --- [ Test worker] c.o.b.i.D.singleBenchmarkRunnerException : The subSingleBenchmarkRunner (Problem_0_Currently used_0) with random seed (null) failed.
java.lang.OutOfMemoryError: Java heap space
at java.base/java.util.Arrays.copyOf(Arrays.java:3480)
at java.base/java.util.ArrayList.grow(ArrayList.java:237)
at java.base/java.util.ArrayList.grow(ArrayList.java:244)
at java.base/java.util.ArrayList.add(ArrayList.java:454)
at java.base/java.util.ArrayList.add(ArrayList.java:467)
at be.myrostering.solver.move.factory.MySwapMoveFactory.createMoveList(MySwapMoveFactory.java:50)
at be.myrostering.solver.move.factory.MySwapMoveFactory.createMoveList(MySwapMoveFactory.java:30)
at org.optaplanner.core.impl.heuristic.selector.move.factory.MoveListFactoryToMoveSelectorBridge.constructCache(MoveListFactoryToMoveSelectorBridge.java:72)
at org.optaplanner.core.impl.heuristic.selector.common.SelectionCacheLifecycleBridge.phaseStarted(SelectionCacheLifecycleBridge.java:51)
at org.optaplanner.core.impl.phase.event.PhaseLifecycleSupport.firePhaseStarted(PhaseLifecycleSupport.java:37)
at org.optaplanner.core.impl.heuristic.selector.AbstractSelector.phaseStarted(AbstractSelector.java:50)
at org.optaplanner.core.impl.phase.event.PhaseLifecycleSupport.firePhaseStarted(PhaseLifecycleSupport.java:37)
at org.optaplanner.core.impl.heuristic.selector.AbstractSelector.phaseStarted(AbstractSelector.java:50)
at org.optaplanner.core.impl.localsearch.decider.LocalSearchDecider.phaseStarted(LocalSearchDecider.java:94)
at org.optaplanner.core.impl.localsearch.decider.MultiThreadedLocalSearchDecider.phaseStarted(MultiThreadedLocalSearchDecider.java:92)
at org.optaplanner.core.impl.localsearch.DefaultLocalSearchPhase.phaseStarted(DefaultLocalSearchPhase.java:141)
at org.optaplanner.core.impl.localsearch.DefaultLocalSearchPhase.solve(DefaultLocalSearchPhase.java:82)
at org.optaplanner.core.impl.solver.AbstractSolver.runPhases(AbstractSolver.java:99)
at org.optaplanner.core.impl.solver.DefaultSolver.solve(DefaultSolver.java:192)
at org.optaplanner.benchmark.impl.SubSingleBenchmarkRunner.call(SubSingleBenchmarkRunner.java:122)
at org.optaplanner.benchmark.impl.SubSingleBenchmarkRunner.call(SubSingleBenchmarkRunner.java:42)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
at java.base/java.lang.Thread.run(Thread.java:832)
2022-01-03 10:56:00.603 INFO 21696 --- [ Test worker] o.o.b.impl.report.BenchmarkReport : Generating benchmark report...
VERSION_2_3_31
java.lang.NoSuchFieldError: VERSION_2_3_31
at org.optaplanner.benchmark.impl.report.BenchmarkReport.writeHtmlOverviewFile(BenchmarkReport.java:828)
at org.optaplanner.benchmark.impl.report.BenchmarkReport.writeReport(BenchmarkReport.java:318)
at org.optaplanner.benchmark.impl.DefaultPlannerBenchmark.benchmarkingEnded(DefaultPlannerBenchmark.java:311)
at org.optaplanner.benchmark.impl.DefaultPlannerBenchmark.benchmark(DefaultPlannerBenchmark.java:100)
at org.optaplanner.benchmark.impl.DefaultPlannerBenchmark.benchmarkAndShowReportInBrowser(DefaultPlannerBenchmark.java:424)
On a final note, it appears that the problem might not be linked to Optaplanner (because the out-of-memory is triggered in MySwapMoveFactory) - if so, I'll close this post. But it would be still odd that it works when running the solver but not the benchmark...
Upvotes: 0
Views: 159
Reputation: 27357
MoveListFactory
scales badly, consuming a lot of memory and cpu.
Some kind of moves have billions of possible moves. For example a 3 swap on 10000 shifts has 1 trillion moves. That doesn't fit into a few GB RAM and it takes ages to generate.
Use a MoveIteratorFactory
instead and don't generate a list of moves, but generate them Just In Time, just like the default selectors do. See docs.
Upvotes: 1
Reputation: 27357
Increase memory, for example with VM option -Xmx4g
Also note that parallelBenchmarkCount AUTO currently doesn't take into account that moveThreadCount is not NONE. So your benchmarks will not be accurate, because if you have 16 cores, parallelBenchmarkCount AUTO will resolve to 8. With moveThreadCount 4 (+ 1 solver thread), you'll be using 32+ cores but only have 16 cores. This probably should be reported as an issue in optaplanner's jira for parallelBenchmarkCount AUTO.
Upvotes: 1