Duncan Lock

Reputation: 12751

Different greps over a variable value in a bash script always print out the same value

I have a cluster. On that cluster I have nodes. On those nodes I have a collection of different processes running (or not) that I would like to see a quick overview of. I wrote this bash script:

#!/usr/bin/env bash
set -o nounset

# Print out the Java process list and store it in a variable
readonly jps=$(jps -v)

# Declare two associative arrays
declare -A up
declare -A pid

# If the process is in the saved list, set to 1, otherwise 0
up[accumulo_master]=$(echo ${jps} | grep -c 'Dapp=master')
up[accumulo_proxy]=$(echo ${jps} | grep -c 'Dapp=proxy')

# Store the PID of the process
pid[accumulo_master]=$(echo ${jps} | grep 'Dapp=master' | awk '{print $1}')
pid[accumulo_proxy]=$(echo ${jps} | grep 'Dapp=proxy' | awk '{print $1}')

echo Cluster Node Status: $(${wht})${me}$(${off})
echo -----------------------------------------
printf "%-28s %-5s %-5s\n" Component Up? PID
echo -----------------------------------------

for i in "${!up[@]}"
do
    printf "%-28s %-5s %-5s\n" $i ${up[$i]} ${pid[$i]};
done |
sort

The real script is the same, but with more elements in both the up and pid associative arrays. This prints out something like the following:

Cluster Node Status: box1
-----------------------------------------
Component                    Up?   PID  
-----------------------------------------
accumulo_master              1    10493
accumulo_proxy               1    10493

The problem is, the PID is always the same - it always prints the first PID that it comes across, repeated for each row - and I don't understand why. If I change the script so that lines like this:

pid[accumulo_master]=$(echo ${jps} | grep 'Dapp=master' | awk '{print $1}')

look like this:

pid[accumulo_master]=$(jps -v | grep 'Dapp=master' | awk '{print $1}')

i.e. - if I run jps -v every time, it works as expected - it just takes much much longer to run.

Any ideas?


Example output from jps -v:

$ jps -v

10493 Main -Dapp=master -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -Djava.net.preferIPv4Stack=true -Xmx128m -Xms128m -XX:OnOutOfMemoryError=kill -9 %p -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl -Djava.library.path=/home/hduser/hadoop/lib/native -Dorg.apache.accumulo.core.home.dir=/home/hduser/accumulo -Dhadoop.home.dir=/home/hduser/hadoop -Dzookeeper.home.dir=/home/hduser/zookeeper
16587 Jps -Dapplication.home=/usr/lib/jvm/java-7-openjdk-amd64 -Xms8m
10634 Main -Dapp=tracer -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -Djava.net.preferIPv4Stack=true -Xmx64m -Xms64m -XX:OnOutOfMemoryError=kill -9 %p -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl -Djava.library.path=/home/hduser/hadoop/lib/native -Dorg.apache.accumulo.core.home.dir=/home/hduser/accumulo -Dhadoop.home.dir=/home/hduser/hadoop -Dzookeeper.home.dir=/home/hduser/zookeeper
10203 Main -Dapp=monitor -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -Djava.net.preferIPv4Stack=true -Xmx64m -Xms64m -XX:OnOutOfMemoryError=kill -9 %p -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl -Djava.library.path=/home/hduser/hadoop/lib/native -Dorg.apache.accumulo.core.home.dir=/home/hduser/accumulo -Dhadoop.home.dir=/home/hduser/hadoop -Dzookeeper.home.dir=/home/hduser/zookeeper

$ jps -v  | grep 'Dapp=master'

10493 Main -Dapp=master -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -Djava.net.preferIPv4Stack=true -Xmx128m -Xms128m -XX:OnOutOfMemoryError=kill -9 %p -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl -Djava.library.path=/home/hduser/hadoop/lib/native -Dorg.apache.accumulo.core.home.dir=/home/hduser/accumulo -Dhadoop.home.dir=/home/hduser/hadoop -Dzookeeper.home.dir=/home/hduser/zookeeper

$ jps -v  | grep 'Dapp=proxy' | awk '{print $1}'

16898

Upvotes: 0

Views: 97

Answers (1)

Duncan Lock

Reputation: 12751

Okay, so I figured this out just after I posted. It was, as it always is with bash, a quoting problem.

If you do this:

readonly jps=$(jps -v)
echo $jps

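It will print the whole multi-line output from jps -v on a single line: the unquoted expansion undergoes word splitting, and echo joins the resulting words with single spaces. This means that awk '{print $1}' will always print the first value, because awk is line-based and there's only one line. For the same reason, grep matches that one line whenever the pattern appears anywhere in the output.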

If instead you do this:

readonly jps=$(jps -v)
echo "$jps"

It prints the output with its line breaks intact, which is what I was expecting, and that makes everything work.
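To see the difference without a running JVM, here is a minimal sketch. The variable sample is a hypothetical, shortened stand-in for the jps -v output (two lines, using the PIDs from the question), not real output:

```bash
#!/usr/bin/env bash
# Hypothetical, shortened stand-in for the output of `jps -v`.
sample=$'10493 Main -Dapp=master\n16898 Main -Dapp=proxy'

# Unquoted: word splitting turns the newline into a space, so grep and awk
# receive a single line and awk's $1 is always the first PID.
unquoted_pid=$(echo ${sample} | grep 'Dapp=proxy' | awk '{print $1}')

# Quoted: the newline survives, so grep selects only the matching line.
quoted_pid=$(echo "${sample}" | grep 'Dapp=proxy' | awk '{print $1}')

echo "unquoted: ${unquoted_pid}"   # prints "unquoted: 10493"
echo "quoted:   ${quoted_pid}"     # prints "quoted:   16898"
```

The unquoted pipeline returns the master's PID even though it asked for the proxy, which is exactly the symptom in the question.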

I just needed to change the script so lines like this:

pid[accumulo_master]=$(echo ${jps} | grep 'Dapp=master' | awk '{print $1}')

look like this:

pid[accumulo_master]=$(echo "${jps}" | grep 'Dapp=master' | awk '{print $1}')
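As a variation (not what I actually used), the grep and awk steps can probably be folded into one awk call behind a small helper. In this sketch pid_of and jps_out are hypothetical names, and jps_out holds made-up sample text in place of real jps -v output:

```bash
#!/usr/bin/env bash
set -o nounset

# Hypothetical stand-in for the saved output of `jps -v`.
jps_out=$'10493 Main -Dapp=master -Xmx128m\n16898 Main -Dapp=proxy -Xmx64m'

# pid_of NAME: print the first field of every line containing "Dapp=NAME".
# printf with a quoted expansion keeps the line breaks intact.
pid_of() {
    printf '%s\n' "${jps_out}" | awk -v pat="Dapp=$1" 'index($0, pat) { print $1 }'
}

declare -A pid
pid[accumulo_master]=$(pid_of master)
pid[accumulo_proxy]=$(pid_of proxy)

for name in "${!pid[@]}"; do
    printf '%-28s %-5s\n' "${name}" "${pid[${name}]}"
done | sort
```

Keeping the quoting in one place means there is only one expansion to get right, rather than one per array entry.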

Upvotes: 1
