Reputation: 303
I'm experiencing some very strange behavior. I have found what appears to be a workaround, but I am hoping that someone can explain to me WHY I'm seeing this crazy behavior.
High-level overview of what I'm doing: I'd like to have a shell script to stop my process. I'd like it to be robust enough to kill one or more instances of the process I'm grepping for. I don't want it to fail if there's NO process running (meaning I want a 0 return code, not an empty argument list passed to the kill command).
What I'm seeing is that the script behaves differently when invoked by passing a command through ssh than when the same script is executed locally. What is very strange is that by adding a seemingly arbitrary command to my ssh command, I'm able to get my script to execute properly and I DON'T KNOW WHY!
The stop script (the echo statements are there to help me debug; they are not part of the real script):
echo "Stopping myProcess"
echo "--> `ps aux | grep myProcess | grep -v grep`"
pid=`ps -ef | grep myProcess | grep -v grep | awk '{ print $2 }'`
echo "Here: ${pid}"
if [[ ! -z $pid ]]; then
    echo "Here2"
    kill -9 $pid
else
    echo "Here3"
    echo "not stopping anything - no myProcess process running."
fi
echo "Here4"
exit 0
Result of local execution of the script when NO process is running:
Stopping myProcess
-->
Here:
Here3
not stopping anything - no myProcess running.
Here4
Result of executing the script from a different machine through the following command:
Command:
ssh eak0703@myServer 'source ${HOME}/.bash_profile;cd /usr/local/myprocess/bin/;./stop-myProcess'
Result:
Stopping myProcess
--> eak0703 2099 0.0 0.0 10728 1500 ? Ss 17:08 0:00 bash -c source ${HOME}/.bash_profile;cd /usr/local/myProcess/bin/;./stop-myProcess
eak0703 2100 0.0 0.0 10740 992 ? S 17:08 0:00 bash -c source ${HOME}/.bash_profile;cd /usr/local/myProcess/bin/;./stop-myProcess
eak0703 2101 0.0 0.0 10740 668 ? S 17:08 0:00 bash -c source ${HOME}/.bash_profile;cd /usr/local/myProcess/bin/;./stop-myProcess
Here: 2099 2100 2105
Here2
Notice: for some reason I can't explain, there appear to be 3 invocations of my command. I also know that this command doesn't terminate with an exit code of 0. I am assuming this is because by the time kill -9 is invoked, the process IDs picked up by the grep are gone.
Now - here's the SAME ssh command with an extra "date | grep crap" thrown in:
Command:
ssh eak0703@myServer 'source ${HOME}/.bash_profile;cd /usr/local/myprocess/bin/;date | grep crap;./stop-myProcess'
Result:
Stopping myProcess
-->
Here:
Here3
not stopping anything - no myProcess running.
Here4
Putting "date | grep crap" fixes things. It appears that the magic is in the "|" (pipe) operator. So I am actually able to make this work with "anycommand | anyothercommand".
I can make it work - but how can I justify randomly leaving such a nugget in a bash script??? No one will ever know why this is there. Not even me! If anyone has encountered this please help!
Upvotes: 0
Views: 811
Reputation: 123410
Parsing ps to find a process is fragile and error prone. Your example is a nice illustration why:
An unrelated process (the bash process started by ssh) contains the process name as part of the command line, and is accidentally picked up by your ps parser.
The unrelated process is removed by your grep -v grep when you make the command line include the word "grep".
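The accidental match can be reproduced without ssh. The following is a minimal sketch, assuming a Linux system with pgrep from procps; the decoy shell and the variable names are made up for illustration, and the decoy stands in for the bash -c process that sshd starts:

```shell
#!/usr/bin/env bash
# Demo only: the background shell below plays the role of the "bash -c ..."
# process started for the ssh command. Its command line mentions
# stop-myProcess, but no program named myProcess is running.
bash -c 'sleep 3; : stop-myProcess' &
decoy=$!
sleep 1   # give the decoy a moment to start

# Matching against the full command line (which is effectively what
# "ps | grep" does) selects the decoy shell.
full_matches=$(pgrep -f myProcess)

# Matching against the exact process name does not, because the decoy's
# process name is "bash".
name_matches=$(pgrep -x myProcess)

echo "command-line matches: ${full_matches}"
echo "process-name matches: ${name_matches:-none}"

# Clean up the decoy shell.
kill "$decoy" 2>/dev/null
```

The first list should contain the PID of the decoy shell, while the second should be empty as long as no real myProcess is running.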
Instead, just use pgrep or pkill. These tools list/kill processes based on the executable name and are therefore far more robust than parsing ps.
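As a rough sketch of how the stop script could look with pkill, assuming the program's process name really is myProcess (the function name stop_process is hypothetical): pkill exits with status 1 when nothing matched, and the script treats that as success so the overall exit code stays 0.

```shell
#!/usr/bin/env bash
# Sketch of a stop script built on pkill instead of parsing ps output.
# "myProcess" is the placeholder name used in the question.

stop_process() {
    local name="$1"
    # -x requires the process name to match exactly, so a shell whose
    # command line merely mentions the name is not selected.
    pkill -KILL -x "$name"
    local rc=$?
    case "$rc" in
        0) echo "stopped $name" ;;
        1) echo "not stopping anything - no $name process running." ;;
        *) echo "pkill failed with status $rc" >&2; return "$rc" ;;
    esac
    return 0
}

echo "Stopping myProcess"
stop_process myProcess
```

If the program runs under an interpreter (for example java or python), the process name is the interpreter's and -x on the name will not find it. Matching the full command line with -f works in that case, but it brings back the risk of selecting unrelated shells.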
Upvotes: 2