sjaak

Reputation: 684

pkill isn't fully killing process?

I'm doing some testing with flock and pkill for a test.sh script that I'm calling from cron and I ran into something I don't understand.

test.sh is scheduled as a * * * * * job in cron. It's a very simple script that, for testing purposes, writes a timestamp to a file and then sleeps for 5 minutes. This is to confirm that flock is working and preventing multiple instances of the same script from running at once.

This part is working well as I only see one timestamp showing up per 5 minutes despite the test.sh being scheduled to run every minute.

Now, as an extra safety measure, I want to kill test.sh (because the script I actually want to use sometimes appears to hang while syncing some files to S3 using the AWS CLI).

So I figured pkill would be the easiest option, as it doesn't require changing anything in my existing script.

If I run pkill -9 -f test.sh it says the process is killed. Running ps aux | grep test.sh, I indeed don't see any test.sh processes anymore.

However, as cron is supposed to run test.sh every minute, I expect that after killing the process, it would start again in less than a minute.

However it appears that the script doesn't actually restart until the sleep period is over.

So the script initially runs at e.g. 12:00, and sleep will last until 12:05. If I kill the script at 12:02 I expect it to run again at 12:03, but it's not actually running again until 12:05, which is in line with the sleep period.

Why is this happening? Also, if pkill is not recommended, is there any other way to kill my processes after a certain amount of time? Preferably without having to edit the original script.

Upvotes: 0

Views: 1629

Answers (1)

pynexj

Reputation: 20798

See the following example:

 1  exec 9> /tmp/flock.tmp
 2  if ! flock -n 9; then
 3      echo "locked by others!"
 4      exit 1
 5  fi
 6
 7  sleep 300

Line 1 opens FD 9 on the lock file. On line 2, flock takes a lock on that FD. The sleep on line 7 inherits the FD and therefore keeps the lock held. When you pkill the .sh script, only the shell is killed, not sleep, so the lock stays held until sleep finishes. So, to clean up, you need to kill every process that was started after the flock call, not just the script itself.
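If you want to see this for yourself, here is a rough sketch that reproduces the behaviour in a few seconds. It is not part of the original setup: run_and_kill, the temporary lock file, the ".ready" marker file and the 3 second sleep are all made up for the demonstration. It also tries one possible variation, starting the child with FD 9 closed (`9>&-`), which should let the lock go as soon as the script is killed. Note that in that variation the child itself still keeps running, so it does not replace killing the child.

```shell
# run_and_kill MODE: start a stand-in for test.sh, SIGKILL it, then report
# whether the lock is still held. MODE is "inherit" or "close".
# (Function name, temp files and durations are hypothetical.)
run_and_kill() {
    mode=$1
    lockfile=$(mktemp)
    ready="$lockfile.ready"

    bash -c '
        exec 9> "$1"
        flock -n 9 || exit 1
        : > "$2"                              # tell the parent the lock is held
        if [ "$3" = close ]; then
            sleep 3 9>&- > /dev/null 2>&1     # FD 9 is closed for sleep only
        else
            sleep 3 > /dev/null 2>&1          # sleep inherits FD 9
        fi
        :
    ' holder "$lockfile" "$ready" "$mode" &
    holder=$!

    # Wait up to about 5 seconds for the stand-in script to take the lock.
    for i in $(seq 50); do
        [ -e "$ready" ] && break
        sleep 0.1
    done

    # Kill only the script, the way pkill -9 -f test.sh would.
    kill -9 "$holder"
    wait "$holder" 2>/dev/null || true

    # Try to take the lock through a fresh open of the same file.
    if flock -n "$lockfile" true; then echo free; else echo locked; fi
    rm -f "$lockfile" "$ready"
}

inherit_result=$(run_and_kill inherit)
close_result=$(run_and_kill close)
echo "child inherited FD 9: lock is $inherit_result"
echo "child had FD 9 closed: lock is $close_result"
```

With the inherited FD the lock should still be reported as locked right after the kill, because the orphaned sleep keeps it; with FD 9 closed for the child it should be free.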


flock(1) uses flock(2) and according to flock(2):

Locks created by flock() are associated with an open file description (see open(2)). This means that duplicate file descriptors (created by, for example, fork(2) or dup(2)) refer to the same lock, and this lock may be modified or released using any of these file descriptors. Furthermore, the lock is released either by an explicit LOCK_UN operation on any of these duplicate file descriptors, or when all such file descriptors have been closed.
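The quoted behaviour can be checked without killing anything. The following is only a small sketch, with a temporary lock file and a 2 second sleep chosen for illustration: the shell takes the lock, starts a child that inherits FD 9, and then closes its own copy of the FD. The lock should remain until the child's copy is gone too.

```shell
lockfile=$(mktemp)                 # throwaway lock file for the demo

# Take the lock on FD 9 in this shell.
exec 9> "$lockfile"
flock -n 9

# The background child inherits FD 9, a duplicate that refers to the
# same open file description and therefore to the same lock.
sleep 2 &
child=$!

# Close our own copy of FD 9. The child's copy is still open.
exec 9>&-

# flock opens the file again (a separate open file description), so this
# attempt conflicts with the lock for as long as the child holds FD 9.
if flock -n "$lockfile" true; then while_child=free; else while_child=locked; fi

# Once the child exits, the last duplicate is closed and the lock is released.
wait "$child"
if flock -n "$lockfile" true; then after_child=free; else after_child=locked; fi

echo "while child runs:  $while_child"
echo "after child exits: $after_child"
rm -f "$lockfile"
```

This is the same situation as the killed script: the shell's copy of the FD disappears, but the lock lives on in the child until it finishes.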

Upvotes: 1
