Reputation: 11
I have a Jenkins job that reboots a host. It is part of a pipeline and has several downstream jobs. Currently the job reboots the host and then sleeps before the downstream build starts. Is there a better way, within the job, to check that the host is back up before continuing, instead of using sleep?
The Reboot_host job currently executes:
ssh <hostname> "sudo reboot"
sleep 90
The host is a VM which is why the sleep duration is so short.
Upvotes: 1
Views: 4682
Reputation: 5149
Assuming you're using Pipeline, I can tell you what we do for our Windows machines, where we use a job to install Windows Updates and trigger a reboot.
The whole process involves several steps and a lot of error checking. As the code requires access to the Jenkins API, we put it into a global shared library where all calls to the Jenkins API are encapsulated in @NonCPS methods. The following is just a rough sketch of what needs to be done; putting the full code here would be way too much.
To trigger a reboot on a Linux machine you may not need all of the steps, but it should do no harm to use them. Of course you have to implement proper error checking as well. I'd put the code in a library, which can also be unit-tested.
1. Wait until the node is idle: poll Computer.countBusy() (for heavy-weight executors) and Computer.getOneOffExecutors() (for flyweight executors) in a loop from within a node block, putting the node offline (Computer.setTemporarilyOffline()) before polling. You can get the computer using getContext(hudson.FilePath).toComputer(). Once exactly one heavy-weight executor is in use (that's us) and no flyweight executor, the node is ready for reboot. Keep the node offline and stay within the node block for the second step.
2. Run the reboot command inside the node block opened in the first step, while the Computer is still offline. Make sure to leave the node block immediately after running the reboot command. E.g. on Windows: bat 'shutdown /t 2 /r'
3. Get a FilePath for that Computer: Computer.getNode().createPath()
4. Disconnect the node with Computer.disconnect(). At least for Windows machines this is very important, as Jenkins sometimes won't notice that it lost the connection and would try to use the old connection, which would fail.
5. Reconnect with Computer.connect(false) and wait until the connection is re-established. We check Computer.getChannel() != null
6. Computer.setTemporarilyOffline(true, 'foo')
7. Computer.waitUntilOnline()
Upvotes: 1
Reputation: 1699
I'm assuming you're using a Jenkinsfile here since you said "pipeline"; if not, please provide a bit more info on your job (freestyle with an execute shell, etc.).
You're probably going to need sleep involved, but you can use it in conjunction with retry to give you faster success (and faster failure). Assuming you just need the VM to be up, you could use something like:
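retry(20) {
    // Pause before each attempt so the host has time to come back up
    sleep time: 5, unit: 'SECONDS'
    // Exits non-zero (triggering another retry) until ssh can connect
    sh 'ssh -o ConnectTimeout=1 <hostname> exit'
}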
This will try to ssh to the host roughly every 5 seconds. Adding ConnectTimeout=1 means ssh will only wait 1 second for the connection to complete. exit just ensures a successful connection is disconnected right away. retry will run the commands up to 20 times, stopping as soon as the sh command returns a 0 (success) exit value. If it runs 20 times without succeeding, the build will fail (which is probably good, since that means your VM isn't available for the downstream jobs).
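If the job turns out to be a freestyle job with an execute shell step rather than a Jenkinsfile, roughly the same retry logic can be sketched in plain POSIX shell. This is only a sketch under assumptions: wait_until is a made-up helper name, and $HOST in the usage comment stands in for your host name.

```shell
# wait_until ATTEMPTS DELAY COMMAND...
# Hypothetical helper (not part of Jenkins or ssh): runs COMMAND up to
# ATTEMPTS times, sleeping DELAY seconds before each try.
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt fails.
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        sleep "$delay"
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
    done
    return 1
}

# Example usage, mirroring retry(20) with a 5 second sleep.
# HOST is an assumed variable holding the host name:
# wait_until 20 5 ssh -o ConnectTimeout=1 "$HOST" exit || exit 1
```

As with retry, a non-zero result after the last attempt should fail the build step, so the downstream jobs are not started against a host that never came back.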
If there's a specific service you're waiting for, you could curl or otherwise make an attempt to contact that service instead of using ssh.
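For an HTTP service, a shell sketch of that idea might look like the following. The function name wait_for_url, its default attempt count and delay, and the URL in the usage comment are all assumptions for illustration; only the curl options themselves are standard.

```shell
# wait_for_url URL [ATTEMPTS] [DELAY]
# Hypothetical helper: polls URL until curl gets a successful response.
# Returns 0 on the first success, 1 if all attempts fail.
wait_for_url() {
    url=$1
    attempts=${2:-20}
    delay=${3:-5}
    i=1
    while [ "$i" -le "$attempts" ]; do
        # --fail: exit non-zero on HTTP errors (status >= 400)
        # --max-time: give up on a single request after 2 seconds
        if curl --silent --fail --max-time 2 --output /dev/null "$url"; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Example usage (assumed URL, adjust to your service):
# wait_for_url "http://myhost:8080/health" 20 5 || exit 1
```

Checking the service itself is usually a stronger signal than ssh, since sshd may accept connections before the service you depend on has finished starting.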
Upvotes: 0