Reputation: 23
I have a large number of Linux servers to maintain. I frequently need to run a script (script.sh) on all of them to get their health status; the script usually takes about 30-40 seconds to produce its output. To make maintenance easier, I'm writing a shell script that uses SSH to loop through all remote hosts, run script.sh, collect the output, and write it to a log file on my local host. For the sake of this question, I have named this script MyScript.sh.
The script works fine, but it has to wait for each SSH command to return before continuing to the next host. Because I have so many servers and the commands run in sequence, it takes several minutes to finish. I would like to run against all servers in parallel, without waiting for each host's response before starting the next.
Is there a way to run script.sh on all hosts simultaneously from MyScript.sh? Maybe by running the ssh command in the background and somehow collecting the output?
The output of script.sh is a single line with fields separated by pipes, such as the following:
host1|49 days|10%|3.77%|27677/63997 MB|43% - /usr|38% - /usr|Optimal|No|40%|No
The output of MyScript.sh is the concatenation of the output from all hosts, without the pipes:
Date Hostname Uptime CPU I/O Free MEM File System INODES STATUS WWW YYY ZZZ XXX
===================================================================================================================================================================================================
01/31/20 host1 44 days 5% 10.33% 38083/64000 MB 57% - / 37% - /usr OPTIMAL No 40% No
01/31/20 host2 45 days 11% 1.79% 27915/63997 MB 43% - /usr 38% - /usr OPTIMAL UP 7% OK
01/31/20 host3 45 days 2% 1.89% 32145/63997 MB 43% - /usr 38% - /usr OPTIMAL UP NO OK
01/31/20 host4 45 days 11% 3.72% 52477/128637 MB 49% - /var 38% - /usr OPTIMAL UP 8% OK
01/31/20 host5 45 days 6% 3.21% 65264/128637 MB 46% - /var 38% - /usr OPTIMAL UP NO OK
01/31/20 host6 45 days 7% 5.79% 56369/63997 MB 43% - /usr 38% - /usr OPTIMAL UP NO No
01/31/20 host7 45 days 6% 1.66% 56391/63997 MB 43% - /var 38% - /usr OPTIMAL UP NO No
The core of MyScript.sh is the following:
(
for ip in $IP_LIST;
do
echo "Checking $ip"
ssh -q -t $user@$ip 'sudo /tmp/script.sh' > /tmp/$$
current_date=$(date +%D)
printf "%-10s " "$current_date" >> $logfile
while read line;
do
echo $line | awk -F '|' '{printf("%-10s %-10s %-7s %-8s %-18s %-25s %-25s %-15s %-15s %-25s %-10s\n",$1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11); }' >> $logfile
done< /tmp/$$
done
)
In summary, I would like to optimize this script to run the above code simultaneously on multiple servers. Thanks!
Upvotes: 2
Views: 2876
Reputation: 464
One solution could be to deploy monitoring software with custom checks.
For the parallel ssh problem, you could use this script I wrote a while ago; it does not require installing any binaries.
Put it in a file named mssh, run chmod u+x mssh, and then:
./mssh -s SERVER1 -s SERVER2 -C script.sh
The mssh file:
#!/usr/bin/env bash
readonly prog_name="$(basename "$0")"
readonly date="$(date +%Y%m%d_%H%M%S)"
# print help
usage() {
cat <<- EOF
usage: $prog_name options
parallel ssh executions.
OPTIONS:
-c --cmd CMD execute command CMD
-s --host SRV execute cmd on server SRV
-C --cmd-file CMD_FILE execute command contained in CMD_FILE
-S --hosts-file SRV_FILE execute cmd on all servers contained in SRV_FILE
-h --help show this help
Examples:
Run CMD on SERVER1 and SERVER2:
./$prog_name -s SERVER1 -s SERVER2 -c "CMD"
EOF
}
# test if an element is in an array
is_element(){
local search=$1; shift;
for e in "$@"; do [[ "$e" == "$search" ]] && return 0; done
return 1
}
# parse arguments
for arg in "$@"; do
case "$arg" in
--help) args+=( -h );;
--host) args+=( -s );;
--hosts-file) args+=( -S );;
--cmd) args+=( -c );;
--cmd-file) args+=( -C );;
*) args+=("$arg");;
esac
done
set -- "${args[@]}"
while getopts "hs:S:c:C:" OPTION; do
case $OPTION in
h) usage; exit 0;;
s) servers_array+=("$OPTARG");;
S) while read -r L; do servers_array+=("$L"); done < <( grep -vE "^ *(#|$)" "$OPTARG");;
c) cmd="$OPTARG";;
C) cmd="$(< "$OPTARG")"; file=$OPTARG;;
*) :;;
esac
done
if [[ -z ${servers_array[0]} ]] || [[ -z $cmd ]]; then
usage; exit 1
fi
# clean up created files at exit
trap "rm -f /tmp/pssh*$date" EXIT
[[ -n $file ]] && echo "executing command file : $file" || echo "executing command : $cmd"
# run cmd on each server
for i in "${!servers_array[@]}"; do
# run cmd in the background, capturing its output in a per-server temp file
ssh -n "${servers_array[$i]}" "$cmd" > "/tmp/pssh_${i}_${servers_array[$i]}_${date}" 2>&1 &
pid=$!
pids_array+=("$pid")
echo "${servers_array[$i]} - $pid"
done
# for each pid, set state to running
ps_state_array=( $(for i in "${!servers_array[@]}"; do echo "running"; done) )
echo "waiting for results..."
echo
# begin finished verifications
continue=true; attempt=0
while $continue; do
# foreach ps
for i in "${!pids_array[@]}"; do
# if already finished skip
[[ ${ps_state_array[$i]} == "finished" ]] && continue
# else check if finished
ps -o pid "${pids_array[$i]}" > /dev/null 2>&1 && ps_finished=false || ps_finished=true
if $ps_finished; then
ps_state_array[$i]="finished"
echo -e "[ ${servers_array[$i]} @ $(date +%H:%M:%S) ]" | grep '.*' --color=always
cat "/tmp/pssh_${i}_${servers_array[$i]}_${date}"
rm -f "/tmp/pssh_${i}_${servers_array[$i]}_${date}"
echo
fi
done
is_element "running" "${ps_state_array[@]}" || continue=false
if $continue; then
(( attempt < 5 )) && attempt=$(( attempt + 1 ))
sleep $attempt
fi
done
exit 0
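If the full script is more than you need, the core idea (start each ssh in the background, write each host's output to its own file, then wait) fits in a few lines. This is only a minimal sketch, not a replacement for mssh: fetch_status and collect_all are names I made up for illustration, it assumes $user is set as in the question, and it assumes sudo on the remote hosts works without a TTY (so no -t).

```shell
# fetch_status HOST: hypothetical wrapper around the remote call,
# kept separate so the ssh options can be changed in one place.
fetch_status() {
    ssh -n -q "$user@$1" 'sudo /tmp/script.sh'
}

# collect_all HOST...: query every host concurrently, then print the
# raw pipe-separated lines in the same order the hosts were given.
# The body is a subshell "( ... )" so its variables and the final
# wait do not affect the calling script.
collect_all() (
    tmpdir=$(mktemp -d) || exit 1
    n=0
    for host in "$@"; do
        n=$((n + 1))
        # one background job per host, each writing to its own file
        fetch_status "$host" > "$tmpdir/$n" 2>/dev/null &
    done
    wait    # returns once every background job has exited
    n=0
    for host in "$@"; do
        n=$((n + 1))
        if [ -s "$tmpdir/$n" ]; then
            cat "$tmpdir/$n"
        else
            # placeholder so an unreachable host still shows up
            printf '%s|NO RESPONSE\n' "$host"
        fi
    done
    rm -rf "$tmpdir"
)
```

Something like collect_all $IP_LIST could then be piped into the while/awk loop from the question. Total run time should be close to that of the slowest host rather than the sum over all hosts.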
Upvotes: 1
Reputation: 33685
With GNU Parallel it looks something like this:
doit() {
ip="$1"
echo "Checking $ip" >&2
current_date=$(date +%D)
printf "%-10s " "$current_date"
ssh -q -t $user@$ip 'sudo /tmp/script.sh' |
awk -F '|' '{printf("%-10s %-10s %-7s %-8s %-18s %-25s %-25s %-15s %-15s %-25s %-10s\n",$1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11); }'
}
export -f doit
export user
parallel -j0 doit ::: $IP_LIST >> $logfile
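One caveat with doit as written: the date is printed before ssh runs, so a host that returns nothing leaves a dangling date with no newline in the log. A possible variant captures the output first and prints a placeholder row when it is empty. This is a sketch only: format_status is a hypothetical helper name, the "NO RESPONSE" text is my own choice, and the ssh flags and awk format are copied unchanged from above.

```shell
# format_status HOST LINE: print one dated log row for HOST, or a
# placeholder row when LINE (the pipe-separated output) is empty.
format_status() {
    if [ -n "$2" ]; then
        printf '%-10s ' "$(date +%D)"
        printf '%s\n' "$2" |
            awk -F '|' '{printf("%-10s %-10s %-7s %-8s %-18s %-25s %-25s %-15s %-15s %-25s %-10s\n",$1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11); }'
    else
        printf '%-10s %-10s %s\n' "$(date +%D)" "$1" "NO RESPONSE"
    fi
}

# doit HOST: same remote call as before, but the output is captured
# first so that every host produces exactly one complete row.
doit() {
    echo "Checking $1" >&2
    format_status "$1" "$(ssh -q -t "$user@$1" 'sudo /tmp/script.sh' 2>/dev/null)"
}

# With GNU Parallel, both functions would need to be exported (bash):
#   export -f doit format_status
#   export user
#   parallel -j0 doit ::: $IP_LIST >> "$logfile"
```

Adding -k to the parallel call should keep the rows in the same order as $IP_LIST instead of completion order.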
Upvotes: 1