Martin Wong

Reputation: 21

Jenkins Xvnc plugin: some display numbers stay allocated when a build is stopped abruptly (e.g. by a Jenkins restart) and cannot be used again

I am using Jenkins with the Xvnc plugin to run acceptance tests on Firefox on a CentOS slave. I have limited the display numbers to 2-4, since at most 3 test instances will need a display at the same time. The tests and the plugin worked fine until Jenkins had to be restarted a few times due to issues in other builds. The following error now occurs whenever the build tries to run:

FATAL: All available display numbers are allocated or blacklisted.
allocated: [2, 3, 4]
blacklisted: []
java.lang.RuntimeException: All available display numbers are allocated or blacklisted.
allocated: [2, 3, 4]
blacklisted: []
    at hudson.plugins.xvnc.DisplayAllocator.doAllocate(DisplayAllocator.java:59)
    at hudson.plugins.xvnc.DisplayAllocator.allocate(DisplayAllocator.java:49)
    at hudson.plugins.xvnc.Xvnc.doSetUp(Xvnc.java:99)
    at hudson.plugins.xvnc.Xvnc.setUp(Xvnc.java:89)
    at jenkins.tasks.SimpleBuildWrapper.setUp(SimpleBuildWrapper.java:146)
    at hudson.model.Build$BuildExecution.doRun(Build.java:156)
    at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:537)
    at hudson.model.Run.execute(Run.java:1741)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
    at hudson.model.ResourceController.execute(ResourceController.java:98)
    at hudson.model.Executor.run(Executor.java:381)

I checked a build that had been working when I restarted Jenkins without manually stopping each job, and found a potential cause:

Terminating xvnc.
FATAL: hudson.remoting.Channel$OrderlyShutdown
hudson.remoting.RequestAbortedException: hudson.remoting.Channel$OrderlyShutdown
    at hudson.remoting.Request.abort(Request.java:296)
    at hudson.remoting.Channel.terminate(Channel.java:815)
    at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1034)
    at hudson.remoting.Channel$2.handle(Channel.java:484)
    at hudson.remoting.AbstractByteArrayCommandTransport$1.handle(AbstractByteArrayCommandTransport.java:61)
    at org.jenkinsci.remoting.nio.NioChannelHub$2.run(NioChannelHub.java:594)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
    at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
    at ......remote call to jenkinstest.build.thoughtwire.com.test(Native Method)
    at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1361)
    at hudson.remoting.Request.call(Request.java:171)
    at hudson.remoting.Channel.call(Channel.java:752)
    at hudson.Launcher$RemoteLauncher.kill(Launcher.java:954)
    at hudson.plugins.xvnc.Xvnc$DisposerImpl.tearDown(Xvnc.java:183)
    at jenkins.tasks.SimpleBuildWrapper$EnvironmentWrapper.tearDown(SimpleBuildWrapper.java:175)
    at hudson.model.Build$BuildExecution.doRun(Build.java:173)
    at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:537)
    at hudson.model.Run.execute(Run.java:1741)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
    at hudson.model.ResourceController.execute(ResourceController.java:98)
    at hudson.model.Executor.run(Executor.java:381)
Caused by: hudson.remoting.Channel$OrderlyShutdown
    at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1034)
    at hudson.remoting.Channel$2.handle(Channel.java:484)
    at hudson.remoting.AbstractByteArrayCommandTransport$1.handle(AbstractByteArrayCommandTransport.java:61)
    at org.jenkinsci.remoting.nio.NioChannelHub$2.run(NioChannelHub.java:594)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
    at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: Command close created at
    at hudson.remoting.Command.<init>(Command.java:56)
    at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1028)
    at hudson.remoting.Channel$CloseCommand.<init>(Channel.java:1026)
    at hudson.remoting.Channel.close(Channel.java:1109)
    at hudson.remoting.Channel.close(Channel.java:1092)
    at hudson.remoting.Channel$CloseCommand.execute(Channel.java:1033)
    at hudson.remoting.Channel$2.handle(Channel.java:484)
    at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:60)

It seems the job did not shut down cleanly, so the Xvnc plugin never got a chance to deallocate the display. I made sure the processes and tests on the slave were properly terminated and that nothing is still running.

The core issue is that display numbers 2, 3, and 4 are now permanently allocated and cannot be reused, even though no builds are running. If the slave (TEST) is mirrored (TEST2), then TEST2 can use displays 2, 3, and 4 but TEST cannot. I have tried reinstalling the plugin, but the numbers stay allocated and linked to TEST.

Does anyone know of a way to clear the list of allocated display numbers? Is this a bug in the plugin? Is there a way to prevent display numbers from staying allocated if, say, Jenkins suddenly dies while jobs are running?

Upvotes: 2

Views: 1674

Answers (2)

Stijn Diependaele

Reputation: 588

This is a Groovy script I wrote to clear the Xvnc display numbers without stopping Jenkins. Be aware that it clears every allocated number, so it may also release numbers held by jobs that are still running.

https://github.com/sdiepend/jenkins-monitoring/blob/master/cleanXvncDisplayNumbers.groovy

import jenkins.model.Jenkins

// Look up the Xvnc plugin's global descriptor, which holds the display allocators.
def jenkins = Jenkins.getActiveInstance()
def xvncDescriptor = jenkins.getDescriptorByType(hudson.plugins.xvnc.Xvnc.DescriptorImpl.class)

xvncDescriptor.allocators.each { entry ->
  def allocator = entry.value
  // collect() makes numAlloc a new collection rather than a reference to
  // allocatedNumbers itself; removing from the collection being iterated
  // would otherwise throw a ConcurrentModificationException.
  def numAlloc = allocator.allocatedNumbers.collect()

  // Release every display number, including any held by running builds.
  numAlloc.each { number ->
    allocator.allocatedNumbers.remove(number)
  }
}

Upvotes: 0

Zhiyu

Reputation: 11

The allocated display numbers are saved in the hudson.plugins.xvnc.Xvnc.xml file on the Jenkins master (in the Jenkins home directory). To clear them, stop Jenkins, empty the <allocatedNumbers> element in that file, and start Jenkins again.

It is important to edit the file only after Jenkins has stopped, because Jenkins saves the current numbers when it shuts down and would overwrite your changes.
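
If you would rather not edit the file by hand, the cleanup step can be scripted. The following is only a minimal sketch in Python, run while Jenkins is stopped. It assumes the file contains one or more <allocatedNumbers> elements whose contents can simply be emptied; I have not verified the exact layout the plugin writes, so the function name and the regex-based approach are my own illustration, not something the plugin provides. It keeps a .bak copy of the original next to the file.

```python
import re
import shutil
from pathlib import Path

# Matches an <allocatedNumbers ...> ... </allocatedNumbers> pair. Self-closing
# (already empty) elements are deliberately not matched.
# Assumption: elements with this name are not nested inside each other.
_ALLOCATED = re.compile(
    r"(<allocatedNumbers(?:\s[^>/]*)?>).*?(</allocatedNumbers>)",
    re.DOTALL,
)


def clear_allocated_numbers(config_path):
    """Empty every <allocatedNumbers> element; return how many were rewritten."""
    path = Path(config_path)
    # Keep a backup next to the original in case the edit needs to be undone.
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    text = path.read_text(encoding="utf-8")
    # Keep the opening and closing tags, drop everything between them.
    cleaned, count = _ALLOCATED.subn(r"\1\2", text)
    path.write_text(cleaned, encoding="utf-8")
    return count
```

You would call it with the path to the file under your own Jenkins home, for example clear_allocated_numbers("/var/lib/jenkins/hudson.plugins.xvnc.Xvnc.xml") (that location is an assumption; use whatever JENKINS_HOME is on your master), and then start Jenkins again.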

Upvotes: 1
