Reputation: 13599
Found an interesting interaction between pkill and ssh. Documenting it here for posterity:
$ ssh user@remote 'false'; echo $?
1
$ ssh user@remote 'false || echo "failed"'; echo $?
failed
0
$ ssh user@remote 'pkill -f "fake_process"'; echo $?
1
$ ssh user@remote 'pkill -f "fake_process" || echo "failed"'; echo $?
255
It seems like example #4 should have the same output as #2; both false and pkill -f "fake_process" exit with code 1 and have no output. However, #4 will always exit with code 255, even if the remote command explicitly calls exit 0. The docs for ssh state that code 255 just means "an error occurred" (super helpful).
Replacing the pkill command with (exit 1), ls fake_file, kill <non-existent PID>, etc. works as expected in every case. Additionally, when running locally (not through ssh), examples #2 and #4 match as expected.
Upvotes: 1
Views: 1380
Reputation: 13599
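The problem appears to be that pkill is killing itself. Or rather, it is killing the shell that owns it.
First of all, it appears that a shell process only stays alive on the remote side for certain "complicated" commands. sshd hands the command to the remote user's shell either way, but for a single simple command the shell is replaced by the command itself (note that the ps below has the PID that $$ expanded to):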
$ ssh user@remote 'ps -F --pid $$'
UID PID PPID C SZ RSS PSR STIME TTY TIME CMD
user 9531 9526 0 11862 1616 6 14:36 ? 00:00:00 ps -F --pid 9531
$ ssh user@remote 'ps -F --pid $$ && echo hi'
UID PID PPID C SZ RSS PSR STIME TTY TIME CMD
user 9581 9577 0 28316 1588 5 14:36 ? 00:00:00 bash -c ps -F --pid $$ && echo hi
hi
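The same effect can be sketched locally, without ssh. This assumes the shell is bash and that a procps-style ps is installed; `-o args=` is used here instead of the `-F` above only to print the bare command line, and the variable names are illustrative.

```shell
# Lone simple command: bash typically execs ps directly,
# so the process with PID $$ is ps itself.
simple=$(bash -c 'ps -o args= -p $$')

# Compound command: bash must stay alive to run the echo afterwards,
# so the process with PID $$ is still "bash -c ..." while ps runs.
compound=$(bash -c 'ps -o args= -p $$ && echo hi')

printf 'simple:   %s\ncompound: %s\n' "$simple" "$compound"
```

In the compound case the captured command line should start with `bash -c`, which is the process a later `pkill -f` can match.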
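Second, it appears that pkill -f normally knows not to kill itself (otherwise every pkill -f command would kill itself). But that protection only covers the pkill process. When pkill is run from a wrapping shell, the shell's own command line also contains the pattern, so pkill -f matches and kills that shell: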
$ pkill -f fake_process; echo $?
1
$ sh -c 'pkill -f fake_process'; echo $?
[1] 14031 terminated sh -c 'pkill -f fake_process'
143
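To see what pkill -f would hit without killing anything, pgrep -f can be swapped in, since it uses the same matching. A minimal sketch, assuming procps-ng (where `-a` prints the full command line of each match); the trailing `; true` is only there so the shell does not exec pgrep directly and disappear from the listing.

```shell
# pgrep only lists matches, so this is safe to run.
# The wrapping "sh -c ..." process shows up because its own
# command line contains the search pattern.
matches=$(sh -c 'pgrep -af fake_process; true')
printf '%s\n' "$matches"
```

The listing should include a line for the `sh -c` wrapper, which is the process that pkill -f ends up terminating.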
In my case, to fix this I just reworked some of the code around my ssh/pkill call so that I could avoid having a "complicated" remote command. Theoretically I think you could also do something like pgrep -f <cmd> | grep -vx $$ | xargs kill (the -x makes grep drop only the exact PID of the shell, not every PID that happens to contain it).
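Another common workaround, not from the answer above, is the bracket trick: wrap one character of the pattern in a character class. A minimal local sketch, assuming no real process on the machine matches the pattern; `out` and `status` are illustrative names.

```shell
# As a regex, [f]ake_process still matches the target name, but the
# wrapping shell's command line contains the literal text "[f]ake_process",
# which that regex does not match. The shell therefore survives.
out=$(sh -c 'pkill -f "[f]ake_process" || echo "failed"')
status=$?
printf '%s\nexit status: %s\n' "$out" "$status"
```

Over ssh the same idea would be `ssh user@remote 'pkill -f "[f]ake_process" || echo "failed"'`, which should then behave like example #2 from the question.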
Upvotes: 2