Reputation: 12751
I have a cluster. On that cluster I have nodes. On those nodes I have a collection of different processes running (or not) that I would like to see a quick overview of. I wrote this bash script:
#!/usr/bin/env bash
set -o nounset
# Print out the Java process list and store it in a variable
readonly jps=$(jps -v)
# Declare two associative arrays
declare -A up
declare -A pid
# If the process is in the saved list, set to 1, otherwise 0
up[accumulo_master]=$(echo ${jps} | grep -c 'Dapp=master')
up[accumulo_proxy]=$(echo ${jps} | grep -c 'Dapp=proxy')
# Store the PID of the process
pid[accumulo_master]=$(echo ${jps} | grep 'Dapp=master' | awk '{print $1}')
pid[accumulo_proxy]=$(echo ${jps} | grep 'Dapp=proxy' | awk '{print $1}')
echo Cluster Node Status: $(${wht})${me}$(${off})
echo -----------------------------------------
printf "%-28s %-5s %-5s\n" Component Up? PID
echo -----------------------------------------
for i in "${!up[@]}"
do
printf "%-28s %-5s %-5s\n" $i ${up[$i]} ${pid[$i]};
done |
sort
The real script is the same, but with more elements in both the up and pid associative arrays. This prints out something like the following:
Cluster Node Status: box1
-----------------------------------------
Component Up? PID
-----------------------------------------
accumulo_master 1 10493
accumulo_proxy 1 10493
The problem is, the PID is always the same - it always prints the first PID that it comes across, repeated for each row - and I don't understand why. If I change the script so that lines like this:
pid[accumulo_master]=$(echo ${jps} | grep 'Dapp=master' | awk '{print $1}')
look like this:
pid[accumulo_master]=$(jps -v | grep 'Dapp=master' | awk '{print $1}')
i.e. if I run jps -v every time, it works as expected; it just takes much, much longer to run.
Any ideas?
Example output from jps -v:
$ jps -v
10493 Main -Dapp=master -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -Djava.net.preferIPv4Stack=true -Xmx128m -Xms128m -XX:OnOutOfMemoryError=kill -9 %p -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl -Djava.library.path=/home/hduser/hadoop/lib/native -Dorg.apache.accumulo.core.home.dir=/home/hduser/accumulo -Dhadoop.home.dir=/home/hduser/hadoop -Dzookeeper.home.dir=/home/hduser/zookeeper
16587 Jps -Dapplication.home=/usr/lib/jvm/java-7-openjdk-amd64 -Xms8m
10634 Main -Dapp=tracer -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -Djava.net.preferIPv4Stack=true -Xmx64m -Xms64m -XX:OnOutOfMemoryError=kill -9 %p -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl -Djava.library.path=/home/hduser/hadoop/lib/native -Dorg.apache.accumulo.core.home.dir=/home/hduser/accumulo -Dhadoop.home.dir=/home/hduser/hadoop -Dzookeeper.home.dir=/home/hduser/zookeeper
10203 Main -Dapp=monitor -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -Djava.net.preferIPv4Stack=true -Xmx64m -Xms64m -XX:OnOutOfMemoryError=kill -9 %p -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl -Djava.library.path=/home/hduser/hadoop/lib/native -Dorg.apache.accumulo.core.home.dir=/home/hduser/accumulo -Dhadoop.home.dir=/home/hduser/hadoop -Dzookeeper.home.dir=/home/hduser/zookeeper
$ jps -v | grep 'Dapp=master'
10493 Main -Dapp=master -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -Djava.net.preferIPv4Stack=true -Xmx128m -Xms128m -XX:OnOutOfMemoryError=kill -9 %p -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl -Djava.library.path=/home/hduser/hadoop/lib/native -Dorg.apache.accumulo.core.home.dir=/home/hduser/accumulo -Dhadoop.home.dir=/home/hduser/hadoop -Dzookeeper.home.dir=/home/hduser/zookeeper
$ jps -v | grep 'Dapp=proxy' | awk '{print $1}'
16898
Upvotes: 0
Views: 97
Reputation: 12751
Okay, so I figured this out just after I posted. It was, as it always is with bash, a quoting problem.
If you do this:
readonly jps=$(jps -v)
echo $jps
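It will print the whole multi-line output from jps -v on one single line, because the unquoted expansion is split into words and echo joins those words with single spaces. This means that awk '{print $1}' will always print the first value: awk is line-based and there's only one line, and grep passes that line through whenever any process on it matches.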
If instead you do this:
readonly jps=$(jps -v)
echo "$jps"
It prints the output with the line breaks intact, which is what I was expecting, and makes everything work.
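To see the difference without needing a running JVM, here is a minimal sketch. The two-line sample string is made up to stand in for real jps -v output, and the variable names are only for illustration; it also assumes the default IFS.

```bash
#!/usr/bin/env bash
# Hypothetical sample standing in for real `jps -v` output (two processes).
sample=$'10493 Main -Dapp=master\n16898 Main -Dapp=proxy'

# Unquoted: the expansion is word-split and echo joins the words with
# spaces, so everything arrives on one line.
unquoted_lines=$(echo $sample | awk 'END { print NR }')

# Quoted: the embedded newline survives, so two lines arrive.
quoted_lines=$(echo "$sample" | awk 'END { print NR }')

# With a single line, grep passes the whole line through and awk's $1
# is the first PID, whichever process was searched for.
unquoted_pid=$(echo $sample | grep 'Dapp=proxy' | awk '{print $1}')
quoted_pid=$(echo "$sample" | grep 'Dapp=proxy' | awk '{print $1}')

echo "unquoted: ${unquoted_lines} line(s), proxy PID ${unquoted_pid}"
echo "quoted:   ${quoted_lines} line(s), proxy PID ${quoted_pid}"
```

With this sample the unquoted pipeline reports one line and the master's PID (10493) for the proxy, while the quoted one reports two lines and 16898. The same collapse is why grep -c in the up array can never report more than 1 when the variable is unquoted.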
I just needed to change the script so lines like this:
pid[accumulo_master]=$(echo ${jps} | grep 'Dapp=master' | awk '{print $1}')
look like this:
pid[accumulo_master]=$(echo "${jps}" | grep 'Dapp=master' | awk '{print $1}')
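As a possible variation (not what I actually ended up using), the echo and grep can be dropped by feeding the quoted variable to awk with a here-string. This is only a sketch: pid_of is a hypothetical helper name, jps_out holds made-up sample data in place of the real jps -v output, and associative arrays need bash 4 or later.

```bash
#!/usr/bin/env bash
set -o nounset

# Hypothetical sample standing in for: readonly jps_out=$(jps -v)
readonly jps_out=$'10493 Main -Dapp=master -Xmx128m\n16898 Main -Dapp=proxy -Xmx64m'

# pid_of (hypothetical helper): print the first field (the PID) of every
# saved line that contains the given fixed string.
pid_of() {
    awk -v pat="$1" 'index($0, pat) { print $1 }' <<< "${jps_out}"
}

declare -A pid
pid[accumulo_master]=$(pid_of 'Dapp=master')
pid[accumulo_proxy]=$(pid_of 'Dapp=proxy')

# Quote every expansion so an empty PID still occupies its own column.
for i in "${!pid[@]}"; do
    printf "%-28s %-5s\n" "$i" "${pid[$i]}"
done | sort
```

The here-string is quoted, so the line breaks are kept, and jps still only runs once.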
Upvotes: 1