Reputation: 514
I am currently busy creating a script to kickstart servers (with CentOS 6.x and CentOS 7.x) remotely. So far the script is working, but hangs on one minor thing. Well actually it does not hang, but it does not give detailed information about what is happening. In other words, I am not getting the correct information back in bash about the job being finished correctly.
I have tried various things, however it's hanging with the following message (which is being repeated endlessly):
servername is still installing and configuring packages...
PING 100.125.150.175 (100.125.150.175) 56(84) bytes of data.
64 bytes from 100.125.150.175: icmp_seq=1 ttl=63 time=0.152 ms
64 bytes from 100.125.150.175: icmp_seq=2 ttl=63 time=0.157 ms
64 bytes from 100.125.150.175: icmp_seq=3 ttl=63 time=0.157 ms
64 bytes from 100.125.150.175: icmp_seq=4 ttl=63 time=0.143 ms
64 bytes from 100.125.150.175: icmp_seq=5 ttl=63 time=0.182 ms
--- 100.125.150.175 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 120025ms
rtt min/avg/max/mdev = 0.143/0.158/0.182/0.015 ms
servername is still installing and configuring packages...
PING 100.125.150.175 (100.125.150.175) 56(84) bytes of data.
64 bytes from 100.125.150.175: icmp_seq=1 ttl=63 time=0.153 ms
64 bytes from 100.125.150.175: icmp_seq=2 ttl=63 time=0.132 ms
64 bytes from 100.125.150.175: icmp_seq=3 ttl=63 time=0.142 ms
etc....
So for some reason it does not contine to the next line of code or does the next action. Since it's only feedback to me (or another user), it's not a majorissue. But it would be nice to get this functional and providing (detailed) information back about the current progress or what the script/server is actually doing at the moment. This is not the case for the above (last) piece of code unfortunately.
This is the current code snippet I have (yes, it's a mess):
while true;
do
#ping -c3 -i3 $HWNODEIP > /dev/null
#ping -c5 -i30 $HWNODEIP > /dev/null
ping -c5 -i30 $HWNODEIP
if [ $? -eq 1 ] || [ $? -eq 2 ] || [ $? -eq 68 ]
then
echo -e " "
echo -e "Kickstart part II also done. $HOSTNAME will be rebooted one more time."
sleep 5
######return 0
echo -e " "
printf "%s" "Waiting for $HOSTNAME to come back online: "
while ! ping -c 1 -n -w 30 $HWNODEIP &> /dev/null
do
printf "%c" "."
#sleep 10
done
echo -e " "
echo -e "Reboot is done and $HOSTNAME is back online. Performing final check. Please wait..."
sleep 10
echo -e " "
sudo /usr/local/collectHWdata.pl $HWNODEIP
ssh root@$HWNODEIP "while ! test -e /root/kickstart-DONE; do sleep 3; done; echo KICKSTART IS DONE\!"
echo -e " "
exit
else
echo -e " "
echo -e "$HOSTNAME is still installing and configuring packages..."
fi
done
Sidenote: I removed > /dev/null #5 for debugging (not that it helped)
I am guessing I am using things incorrectly and I am by no means a experienced scripter; I can only do minor stuff, but ofcourse I am doing my best. I have been fooling around with this since last week and still no result on this part.
The server is rebooted after the selected CentOS version, creating partitions and setting up the network. This all works. The above snippet is after that reboot. Now it will install packages I selected, configure various things (like Nagios) and install/compile certain PERL modules. And a few other minor things.
This is done correctly in the background. I wanted to make the script (the above piece of code) that the server is still busy with installing things and such. Since I lack the knowledge to do that, I decided for a different approach; check if the server is online (in other words that it's still installing). As long as the server is online, it's still installing/configuring things obviously. After that is done, the server will reboot once more to perform the final 2 commands (as seen in my snippet). However (here is the problem) it never does those commands, though the kickstart is completely done.
So I am guessing I am doing something wrong and even might messed up things (or got confused by doing so). Maybe someone has an idea, solution or a completely different approach to tackle and fix this problem (or at least I hope so).
Other things I have tried so far? Well I tried a various of ping commands and I also tried nc (netcat) but also without a good result. I every single time hit a brick wall with the last 2 commands and it keeps pinging instead of showing that the kickstart was done... I think I have spend several hours (since last week) on this already without getting anywhere.
So I am hoping someone can take a look at this and tell me what I am doing wrong and maybe there is a better approach (other than pinging a server) to see if it's still busy. Maybe a (remote) check on yum, perl or a service, so that the script knows it's still busy.
Sorry for the long post, but I know when I provide as much information as possible including code examples and results, this is more "appreciated". So I am hoping I provided adequate information. If not, let me know. I will try to add as much information as I can. As always I am always willing to learn or change my approach.
Thank you already for reading my post!
Upvotes: 1
Views: 228
Reputation: 968
As noted in the comments under the question:
The server may already be rebooted by the time ping -c5 -i30 $HWNODEIP
finishes. The command sends 5 packets (-c
flag), waiting 30 seconds between each packet (-i interval
flag). So thats's 5*30 = 150 seconds, which is a bit more than 2 minutes. A server could reboot just fine within 2 minutes, especially if there's SSD in use. So try lowering the total time it would take this command to complete.
[ $? -eq 68 ]
is probably unnecessary. $HWNODEIP
is just ip address, and exit code 68 is for domain name not being resolved, which doesn't apply to IP addresses.
The if
statement could be simplified to
if ! ping -c5 -i30 "$HWNODEIP"
These are minor suggestions,probably not bulletproof. As confirmed by OP in the comments, lowering interval helps. There's other small improvements that could be done (like quoting variables), but that's outside the scope of the question, so I'll leave it for now.
Upvotes: 1