Reputation: 1141
It may look menacing, but the task is really simple.
The server has the following directory structure:
/usr/multi
    /1
        job
        file.a
        file.b
    /2
        job
        file.a
        file.b
    /3
        job
        file.a
        file.b
And the following code:
#this is thread.sh
cd /usr/multi
#find the first directory that has a job file
id=$(ls */job)
#strip everything after "/" ("1/job" becomes "1")
id=${id%%/*}
#read job
read job <$id/job
if [ "$id" == "" ] || [ "$job" == "" ]
then
    false
else
    #mark that id as busy
    mv $id/job $id/_job
    #execute the job
    script.sh $1 $job
    #mark that id as available
    mv $id/_job $id/job
fi
script.sh performs some operations (described in the job file and the received argument) on file.a and file.b.
Clients, on the other hand, execute this code:
#loop infinitely on failure, break loop on success
false
while [ "$?" != "0" ]
do
    result=$(ssh $server "thread.sh 'some instructions'" </dev/null)
done
echo $result
So every client gets a separate id and gives the server some instructions to perform the job specified for that id. If there are more clients than available jobs on the server, clients keep trying to grab the first available id and, when they get one, give the server instructions to perform the corresponding job.
The problem is that every once in a while two clients get the same id, and thread.sh messes up file.a and file.b.
My theory is that this happens when two clients request an id from the server at almost the same time, so that the server cannot rename the job file quickly enough for one client to see it as available and the other to see it as busy.
Should I put a random sleep interval just before the if [ "$id" == "" ] || [ "$job" == "" ] line, so I get some more randomness in the timing?
Upvotes: 1
Views: 306
Reputation: 8223
As you have correctly determined, your script is quite racy.
Simple locking in bash can be implemented using
set -o noclobber
and writing to a lockfile. If the lock is already held (the file exists), your write attempt will fail atomically.
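Here is a minimal sketch of that idea, not a drop-in fix. The lockfile path and the function names acquire_lock, release_lock and with_lock are made up for illustration. It assumes bash and a lockfile on a local filesystem.

```shell
#!/bin/bash
# Sketch of noclobber-based locking. LOCKFILE and all function names
# below are hypothetical; adjust them to your setup.
LOCKFILE="${LOCKFILE:-/tmp/thread.lock}"

# Try to take the lock. Returns 0 on success, non-zero if it is already held.
acquire_lock() {
    # With noclobber set, ">" refuses to overwrite an existing file, so
    # "does the lock exist?" and "create the lock" are a single step.
    # The subshell keeps noclobber from leaking into the rest of the script.
    ( set -o noclobber; echo "$$" > "$LOCKFILE" ) 2>/dev/null
}

# Give the lock back by removing the lockfile.
release_lock() {
    rm -f "$LOCKFILE"
}

# Run a command while holding the lock; fail right away if the lock is taken.
with_lock() {
    acquire_lock || return 1
    "$@"
    local status=$?
    release_lock
    return "$status"
}
```

In thread.sh you would probably hold the lock only around the part that picks an id and renames job to _job, then release it before running script.sh, so that other clients can still grab the remaining ids while a job runs. If acquire_lock fails, returning a non-zero status makes your existing client loop simply retry.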
Upvotes: 1