J Fabian Meier

Reputation: 35843

Handle technical error in Jenkins Step (Jenkins pipeline)

A step calls a shell script (e.g. to install an artifact on a test server) and fails for technical reasons (the network is down, the database is unavailable, etc.). Possible reactions:

  1. Fail the whole job. Not a good idea because then one would need to repeat the previous steps for no good reason (these steps can include manual ones).

  2. Retry the step after e.g. a 5 minute wait. Could solve the problem sometimes, but could also lead to an infinite loop if the problem was caused by misconfiguration.

  3. Ask the user whether to proceed or abort. This is the most flexible approach, but it requires (unexpected) user interaction.

Does Jenkins have a standard solution for this?

Upvotes: 1

Views: 316

Answers (1)

Wimateeka

Reputation: 2706

Fail the whole job. Not a good idea because then one would need to repeat the previous steps for no good reason (these steps can include manual ones).

It depends on how you rank the severity of the problem. Personally, I would say that if you have any kind of build issue, you should log it, fail the job, and send a notification to whoever needs to know. But if you feel that's overkill, you might want to try something else.

Retry the step after e.g. a 5 minute wait. Could solve the problem sometimes, but could also lead to an infinite loop if the problem was caused by misconfiguration.

You can set a timeout in a Jenkinsfile to give up after, say, an hour, or however long you think the job should take. This way you get to retry a few times, and the job is guaranteed to terminate if it gets stuck.
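As a rough sketch of how that could look, the built-in `timeout`, `retry`, and `sleep` pipeline steps can be nested. The stage name, the script name `./install-artifact.sh`, the attempt count, and the durations are assumptions made for illustration, not values from the question:

```groovy
pipeline {
    agent any
    stages {
        stage('Deploy to test server') {   // hypothetical stage name
            steps {
                // Upper bound for all attempts together: abort after one hour.
                timeout(time: 1, unit: 'HOURS') {
                    script {
                        int attempt = 0
                        // Run the body at most 3 times; the step fails after the third failure.
                        retry(3) {
                            if (attempt > 0) {
                                // Wait 5 minutes before every attempt except the first.
                                sleep(time: 5, unit: 'MINUTES')
                            }
                            attempt++
                            // Hypothetical script; sh fails the step on a non-zero exit code.
                            sh './install-artifact.sh'
                        }
                    }
                }
            }
        }
    }
}
```

Because `retry` takes a fixed count and `timeout` caps the total duration, a misconfiguration leads to a failed build rather than an endless loop.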

Ask the user whether to proceed or abort. This is the most flexible approach, but it requires (unexpected) user interaction.

User interaction isn't a great option, especially if you have no idea when it will happen or if your jobs are scheduled to run between 10 pm and 5 am.
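If a prompt is used anyway, one possible mitigation is to wrap the built-in `input` step in `timeout`, so that an unattended job is aborted instead of waiting indefinitely. This is only a sketch; the stage name, message text, and the 30-minute limit are assumptions:

```groovy
pipeline {
    agent any
    stages {
        stage('Confirm retry') {   // hypothetical stage name
            steps {
                // Without a timeout, input pauses the build until someone answers.
                timeout(time: 30, unit: 'MINUTES') {
                    // Choosing "Abort" in the prompt, or hitting the timeout, aborts the build.
                    input message: 'Installation failed. Retry the step?', ok: 'Retry'
                }
            }
        }
    }
}
```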

Of all the options, it sounds like the best for you may be the second: retry a few times, and if that doesn't work, log what happened, fail, and notify someone. That should cover your sporadic technical issues. And if retrying doesn't help, that in itself shows there are bigger problems to fix.
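A minimal declarative sketch of "retry, then fail and notify" might look like the following. It assumes the `mail` step is available and an SMTP server is configured in Jenkins; the recipient address, stage name, and script name are placeholders:

```groovy
pipeline {
    agent any
    stages {
        stage('Deploy to test server') {   // hypothetical stage name
            steps {
                // Try up to 3 times, then let the failure propagate and fail the build.
                retry(3) {
                    sh './install-artifact.sh'   // hypothetical script
                }
            }
        }
    }
    post {
        // Runs only when the build result is FAILURE.
        failure {
            echo "Build failed, see ${env.BUILD_URL}"
            mail to: 'team@example.com',   // placeholder recipient
                 subject: "Failed: ${env.JOB_NAME} #${env.BUILD_NUMBER}",
                 body: "Console log: ${env.BUILD_URL}console"
        }
    }
}
```

The console output of each attempt stays in the build log, so the notification only needs to link to it.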

Upvotes: 2
