eak12913

Reputation: 303

running script through ssh fails when locally same script succeeds

I'm seeing some very strange behavior. I have found what appears to be a workaround, but I am hoping that someone can explain to me WHY I'm seeing this crazy behavior.

High-level overview of what I'm doing: I'd like to have a shell script to stop my process. I'd like it to be robust enough to kill one or more instances of the process I'm grepping for. I don't want it to fail if there's NO process running (meaning I want a 0 return code, not an empty arg list passed to the kill command).

What I'm seeing is that the script behaves differently when invoked through a command passed to ssh than when that same script is executed locally. What is very strange is that by adding a seemingly arbitrary command to my ssh command, I'm able to get my script to execute properly and I DON'T KNOW WHY!

The stop script (the echo statements were there to help me debug and are not part of the real script):

echo "Stopping myProcess"
echo "-->`ps aux | grep myProcess | grep -v grep`"
pid=`ps -ef | grep myProcess | grep -v grep | awk '{ print $2 }'`
echo "Here: ${pid}"
if [[ ! -z $pid ]]; then
    echo "Here2"
    kill -9 $pid
else
    echo "Here3"
    echo "not stopping anything - no myProcess process running."
fi
echo "Here4"
exit 0

Result of local execution of the script when NO process is running:

Stopping myProcess
-->
Here:
Here3
not stopping anything - no myProcess running.
Here4

Result of executing the script from a different machine through the following command:

Command:

ssh eak0703@myServer 'source ${HOME}/.bash_profile;cd /usr/local/myprocess/bin/;./stop-myProcess'

Result:

Stopping myProcess
--> eak0703 2099 0.0 0.0 10728 1500 ? Ss 17:08 0:00 bash -c source ${HOME}/.bash_profile;cd /usr/local/myProcess/bin/;./stop-myProcess
eak0703 2100 0.0 0.0 10740 992 ? S 17:08 0:00 bash -c source ${HOME}/.bash_profile;cd /usr/local/myProcess/bin/;./stop-myProcess
eak0703 2101 0.0 0.0 10740 668 ? S 17:08 0:00 bash -c source ${HOME}/.bash_profile;cd /usr/local/myProcess/bin/;./stop-myProcess
Here: 2099 2100 2105
Here2

Notice: for some reason I can't explain, there appear to be 3 invocations of my command. I also know that this command doesn't terminate with an exit code of 0. I am assuming this is because by the time kill -9 is invoked, the process IDs picked up by the grep are gone.

Now - here's the SAME ssh command with an extra "date | grep crap" thrown in:

Command:

ssh eak0703@myServer 'source ${HOME}/.bash_profile;cd /usr/local/myprocess/bin/;date | grep crap;./stop-myProcess'

Result:

Stopping myProcess
-->
Here:
Here3
not stopping anything - no myProcess running.
Here4

Putting "date | grep crap" fixes things. It appears that the magic is in the "|" (pipe) operator. So I am actually able to make this work with "anycommand | anyothercommand".

I can make it work - but how can I justify randomly leaving such a nugget in a bash script??? No one will ever know why this is there. Not even me! If anyone has encountered this please help!

Upvotes: 0

Views: 811

Answers (1)

that other guy

Reputation: 123410

Parsing ps to find a process is fragile and error-prone. Your example is a nice illustration of why:

An unrelated process (the bash process started by ssh) contains the process name as part of the command line, and is accidentally picked up by your ps parser.

When you add "date | grep crap" to the ssh command, the command line of that unrelated bash process now contains the word "grep", so your grep -v grep filters it out.
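The effect can be reproduced without ssh. The sketch below feeds two made-up ps-style lines (modelled on the output in the question, not real captures) through the same grep/awk filter the stop script uses. The function name find_pids and both sample variables are only for illustration.

```shell
# Hypothetical ps-style lines modelled on the question's output.
# Single quotes keep ${HOME} literal, as it appears in the process list.
ps_plain='eak0703 2099 1 0 17:08 ? 00:00:00 bash -c source ${HOME}/.bash_profile;cd /usr/local/myProcess/bin/;./stop-myProcess'
ps_with_grep='eak0703 2099 1 0 17:08 ? 00:00:00 bash -c source ${HOME}/.bash_profile;cd /usr/local/myProcess/bin/;date | grep crap;./stop-myProcess'

# The same filter as in the stop script, applied to text on stdin.
find_pids() {
    grep myProcess | grep -v grep | awk '{ print $2 }'
}

pids_plain=$(printf '%s\n' "$ps_plain" | find_pids)
pids_with_grep=$(printf '%s\n' "$ps_with_grep" | find_pids)

# The first line is matched because its command line contains "myProcess".
echo "without grep in the ssh command: '${pids_plain}'"
# The second line is dropped by grep -v grep, so nothing is left to kill.
echo "with grep in the ssh command: '${pids_with_grep}'"
```

So the pipe itself is not what matters here; any ssh command line that happens to contain the string "grep" would be filtered out the same way.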

Instead, just use pgrep or pkill. These tools list/kill processes based on the executable name and are therefore far more robust than parsing ps.
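A minimal sketch of a stop script built on pkill, assuming the procps version of pkill (exit status 0 when something was signalled, 1 when nothing matched) and assuming the executable really is named myProcess, which is the placeholder from the question. The function name stop_process is made up for this example.

```shell
# Sketch only: stop every process whose executable name is exactly "$1".
stop_process() {
    local name="$1"
    # -x requires the process name to match exactly, so a bash -c command
    # line that merely mentions the name is not picked up.
    pkill -9 -x "$name"
    case $? in
        0) echo "stopped ${name}" ;;
        1) echo "not stopping anything - no ${name} process running." ;;
        *) echo "pkill failed for ${name}" >&2; return 1 ;;
    esac
    return 0
}

# Returns 0 whether or not the process was running.
stop_process "myProcess"
```

One caveat: on Linux, pgrep and pkill match against a process name that is truncated to 15 characters, so a longer executable name would need a different pattern or the -f option, which matches the full command line and brings back the false-match problem described above.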

Upvotes: 2
