Wayne

Reputation: 2989

simple timeout on I/O for command for linux

First, some background. During development and testing, our continuous integration build often hits deadlocks, infinite loops, or other problems that leave a test running forever. When that happens the build never finishes, so all the mechanisms for notifying that a build has failed become useless.

The solution is to have the build script time out if nothing is written to the build log file for more than 5 minutes. The build routinely writes out the names of unit tests as it proceeds, so a silent log is the best way to tell that it's "frozen".

Okay. Now the nitty gritty...

The build server uses Hudson to run a simple bash script that invokes the more complex build script based on Nant and MSBuild (all on Windows).

So far, every solution I've found online puts a timeout on the total run time of the command. That fails in this case because the tests might hang in the first 5 minutes, and the hang would go unnoticed until the full timeout expired.

What we've thought of so far:

First, here's the high-level bash command to run the full test suite in Hudson.

build.sh clean free test

That command simply sends all the Nant and MSBuild build logging to stdout.

It's obvious that we need to tee that output to a file:

build.sh clean free test 2>&1 | tee build.out

Then, in parallel, a command needs to sleep, check the modification time of the file, and kill the main process if the file is more than 5 minutes old. A kill -9 will be fine at that point; nothing graceful is needed once the build has frozen.
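The watcher described above could be sketched roughly as follows. This is only an illustration under stated assumptions: watch_idle is a hypothetical helper name, `stat -c %Y` assumes GNU coreutils, and the usage in the comments redirects to the file instead of using tee so that `$!` is the PID of the build script itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: watch_idle FILE LIMIT_SECONDS PID [POLL_SECONDS]
# Sends SIGKILL to PID once FILE has gone unmodified for more than LIMIT_SECONDS.
# Returns 1 if it killed the process, 0 if the process ended on its own.
watch_idle() {
    local file=$1 limit=$2 pid=$3 poll=${4:-10}
    local now mtime
    while kill -0 "$pid" 2>/dev/null; do      # loop while the process is alive
        sleep "$poll"
        now=$(date +%s)
        # Assumes GNU stat; %Y is the modification time in epoch seconds.
        mtime=$(stat -c %Y "$file" 2>/dev/null) || continue
        if (( now - mtime > limit )); then
            kill -9 "$pid" 2>/dev/null || true
            return 1
        fi
    done
    return 0
}

# Possible usage (commented out; build.sh is the script from the question):
#   build.sh clean free test > build.out 2>&1 &
#   watch_idle build.out 300 "$!"
```

Note that kill -9 on the script's PID does not necessarily take down child processes it started, so a real version may need to target the whole process group.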

That's the part you can help with.

In fact, I wrote a script like this over 15 years ago to drop a data phone line connection to Japan after periods of inactivity, but I can't remember how I did it.

Sincerely, Wayne

Upvotes: 0

Views: 1816

Answers (3)

Wayne

Reputation: 2989

Solved this myself by writing a bash script.

It's called iotimeout and takes one parameter: the timeout in seconds.

You use it like this:

build.sh clean dev test | iotimeout 120

iotimeout has 2 loops.

One is a simple while read line loop that echoes each line, but it also uses the touch command to update the modified time of a temp file every time it writes a line. Unfortunately, it wasn't possible to monitor a build.out file because Windows doesn't update the file's modified time until you close the file. Oh well.

The other loop runs in the background: a forever loop that sleeps 10 seconds and then checks the modified time of the temp file. If the file is ever more than 120 seconds old, that loop forces the entire process group to exit.

The only tricky part was returning the exit code of the original program. Bash gives you a PIPESTATUS array to solve that.

Figuring out how to kill the entire process group also took some research, but it turns out to be easy: just kill 0
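As a small illustration of the PIPESTATUS point, here is a sketch in which fake_build is a hypothetical stand-in for build.sh and cat stands in for iotimeout:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the build: prints one line, then fails with status 3.
fake_build() { echo "running tests"; return 3; }

fake_build | cat            # the pipeline's own status is cat's, i.e. 0
status=${PIPESTATUS[0]}     # must be read right after the pipeline runs
echo "build exit status: $status"
```

A wrapper script could then end with `exit "$status"` so that Hudson sees the build's result and not the filter's.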
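Putting the pieces together, the script might look roughly like the sketch below. This is not the original iotimeout, only a reconstruction from the description: it is written as a function, the second parameter (poll interval) is an addition for illustration, `stat -c %Y` assumes GNU coreutils, and it uses kill -9 0 where plain kill 0 would send SIGTERM.

```shell
#!/usr/bin/env bash
# Sketch: iotimeout LIMIT_SECONDS [POLL_SECONDS]
# Copies stdin to stdout; if no line arrives for more than LIMIT_SECONDS,
# kills every process in the current process group.
iotimeout() {
    local limit=$1 poll=${2:-10} stamp watcher line
    stamp=$(mktemp)          # temp file whose mtime records the last output

    # Background loop: wake up periodically and check how old the stamp is.
    (
        while sleep "$poll"; do
            now=$(date +%s)
            mtime=$(stat -c %Y "$stamp" 2>/dev/null) || exit 0
            if (( now - mtime > limit )); then
                echo "iotimeout: no output for more than ${limit}s" >&2
                kill -9 0    # signal the whole process group, this shell included
            fi
        done
    ) </dev/null >/dev/null &
    watcher=$!

    # Foreground loop: pass each line through and refresh the stamp.
    while IFS= read -r line; do
        printf '%s\n' "$line"
        touch "$stamp"
    done

    # Input ended normally: stop the watcher and clean up.
    kill "$watcher" 2>/dev/null || true
    wait "$watcher" 2>/dev/null || true
    rm -f "$stamp"
}
```

Because kill 0 targets the caller's process group, anything else sharing that group (including the calling script) is terminated as well, which is the intended effect in a CI job but worth keeping in mind when trying it from an interactive shell.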

Upvotes: 0

Dennis Williamson

Reputation: 360065

You may be able to use timeout:

timeout 300 command

Upvotes: 0

Maxim Egorushkin

Reputation: 136256

build.sh clean free test 2>&1 | tee build.out &
sleep 300
kill -KILL %1

Upvotes: 1
