Reputation: 751
I have a heavily parallelized build across 45 slaves (one master that just handles launches).
The problem I am running into is that about 3% of the jobs disappear.
The project setup is a "master" job that then launches (via the parameterized job plugin) N jobs across N slaves. Most of the time, the console output for the master job is correct with regard to the job numbers of the distributed build steps.
Occasionally, however, the job number indicated in the console actually belongs to a completely different build.
Where do I even start looking to track this down? The Jenkins logs are eerily empty of any information about failed jobs or problems launching jobs.
My best guess at the moment is that the missing jobs were actually queued waiting for executors when something happened to remove them. But I have no evidence to support this.
Thoughts, suggestions, and helpful links are all greatly appreciated.
Upvotes: 5
Views: 2837
Reputation: 597
As long as https://issues.jenkins-ci.org/browse/JENKINS-15156 and its linked bugs are open, this will happen in certain cases. It does not matter what you use for parallel or dependent builds; it is a core problem. Leave it or live with it.
I doubt additional logging is a fix or answer to your problem.
My answer would be: debug it and send patches to the devs.
Upvotes: 0
Reputation: 16615
Here's how you can get more info: go to http://[jenkins_server]/log/ -> Add new log recorder -> enter a name of your choice -> OK -> Add -> enter hudson.model.Run as Logger -> set Log Level to all -> Save.
Now http://[jenkins_server]/log/[your log name]/ will provide you with more info as far as running your jobs is concerned.
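Alongside the log recorder, it may help to check which upstream build each child build thinks it belongs to. The following is only a rough sketch in Python, not something taken from the question: it assumes the child builds expose their upstream cause through the JSON remote API at /job/[job]/[number]/api/json with upstreamProject and upstreamBuild fields (worth verifying on your Jenkins version), and the function names and the "master-job"/"child" job names are made up for illustration. Jobs inside folders would need a different URL path.

```python
import json
from urllib.request import urlopen


def fetch_build(base_url, job, number):
    """Fetch one build's JSON from the Jenkins remote API (needs network access).

    Assumes an unauthenticated, top-level job; adjust for auth or folders.
    """
    url = f"{base_url.rstrip('/')}/job/{job}/{number}/api/json"
    with urlopen(url) as resp:
        return json.load(resp)


def upstream_of(build_json):
    """Return (upstreamProject, upstreamBuild) from a build's causes, or None."""
    for action in build_json.get("actions", []):
        # The actions list can contain empty or null entries, so guard for that.
        for cause in (action or {}).get("causes", []):
            if "upstreamProject" in cause:
                return cause["upstreamProject"], cause["upstreamBuild"]
    return None


def find_mismatches(parent_job, parent_build, child_builds):
    """List children whose recorded upstream cause is not the expected parent build.

    child_builds maps (job_name, build_number) -> build JSON dict.
    """
    expected = (parent_job, parent_build)
    return sorted(
        key for key, data in child_builds.items()
        if upstream_of(data) != expected
    )
```

You would feed find_mismatches the child job numbers printed in the master job's console, fetched with fetch_build. Any entry it returns is a build that the console attributes to this run but that records a different upstream cause (or none), which narrows down where to look in the hudson.model.Run log.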
Upvotes: 6