Reputation: 2610
I have the following file
more /etc/hosts
23.1.22.162 kafka01.dfg.com
23.1.22.155 kafka02.dfg.com
23.1.22.222 kafka03.dfg.com
23.1.22.111 master01.dfg.com
23.1.22.239 master02.dfg.com
23.1.22.170 master03.dfg.com
23.1.22.167 worker01.dfg.com
23.1.22.165 worker02.dfg.com
23.1.22.112 worker03.dfg.com
We want to capture only the master and worker machines when kafka_name="", using egrep, so we ran:
kafka_name=""
egrep "\smaster|\sworker|\s$kafka_name" /etc/hosts
but the output still includes the kafka machines (and every other line in the file):
egrep "\smaster|\sworker|\s$kafka_name" /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
23.1.22.162 kafka01.dfg.com
23.1.22.155 kafka02.dfg.com
23.1.22.222 kafka03.dfg.com
23.1.22.111 master01.dfg.com
23.1.22.239 master02.dfg.com
23.1.22.170 master03.dfg.com
23.1.22.167 worker01.dfg.com
23.1.22.165 worker02.dfg.com
23.1.22.112 worker03.dfg.com
When we set
kafka_name="kafka"
we get the kafka machines as well, as expected:
egrep "\smaster|\sworker|\s$kafka_name" /etc/hosts
23.1.22.162 kafka01.dfg.com
23.1.22.155 kafka02.dfg.com
23.1.22.222 kafka03.dfg.com
23.1.22.111 master01.dfg.com
23.1.22.239 master02.dfg.com
23.1.22.170 master03.dfg.com
23.1.22.167 worker01.dfg.com
23.1.22.165 worker02.dfg.com
23.1.22.112 worker03.dfg.com
so why when we set
kafka_name=""
does it still print the kafka machines from hosts even though $kafka_name is empty?
Upvotes: 0
Views: 85
Reputation: 203324
FYI, egrep is deprecated in favor of grep -E.
Consider using awk instead, though, for clear, simple control over whatever conditions (not just regexps - conditions) you want to express, e.g.:
$ kafka_name=''
$ awk -v kafka_name="$kafka_name" '( $2 ~ /^(master|worker)/ ) || ( (kafka_name != "") && ($2 ~ ("^"kafka_name)) )' file
23.1.22.111 master01.dfg.com
23.1.22.239 master02.dfg.com
23.1.22.170 master03.dfg.com
23.1.22.167 worker01.dfg.com
23.1.22.165 worker02.dfg.com
23.1.22.112 worker03.dfg.com
$ kafka_name='kafka02'
$ awk -v kafka_name="$kafka_name" '( $2 ~ /^(master|worker)/ ) || ( (kafka_name != "") && ($2 ~ ("^"kafka_name)) )' file
23.1.22.155 kafka02.dfg.com
23.1.22.111 master01.dfg.com
23.1.22.239 master02.dfg.com
23.1.22.170 master03.dfg.com
23.1.22.167 worker01.dfg.com
23.1.22.165 worker02.dfg.com
23.1.22.112 worker03.dfg.com
The above will work using any awk in any shell on every Unix box.
It uses regexp rather than string comparisons, though, just as your egrep command did, so if any of those names can contain regexp metachars you'd need to escape them or change the script to use index($2,string) == 1 everywhere instead of $2 ~ /^regexp/, e.g.:
$ awk -v kafka_name="$kafka_name" '(index($2,"master") == 1) || (index($2,"worker") == 1) || ( (kafka_name != "") && (index($2,kafka_name) == 1) )' file
23.1.22.155 kafka02.dfg.com
23.1.22.111 master01.dfg.com
23.1.22.239 master02.dfg.com
23.1.22.170 master03.dfg.com
23.1.22.167 worker01.dfg.com
23.1.22.165 worker02.dfg.com
23.1.22.112 worker03.dfg.com
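To see the difference between the regexp and index() forms, here is a small sketch. The temporary file and the host name kafka0.dfg.com are made up for illustration and are not part of the question's data; the point is only that a "." in the name acts as a wildcard in the regexp version but is taken literally by index().

```shell
# Hypothetical sample data (not the asker's /etc/hosts).
hosts_file=$(mktemp)
printf '%s\n' \
  '23.1.22.162 kafka01.dfg.com' \
  '23.1.22.155 kafka0.dfg.com' > "$hosts_file"

# A name containing a regexp metacharacter (the trailing dot).
name='kafka0.'

# Regexp comparison: "." matches any character, so kafka01 matches too.
regex_hits=$(awk -v n="$name" '$2 ~ ("^" n) {c++} END {print c+0}' "$hosts_file")

# String comparison: only a literal "kafka0." prefix matches.
index_hits=$(awk -v n="$name" 'index($2, n) == 1 {c++} END {print c+0}' "$hosts_file")

echo "regexp matched $regex_hits line(s), index() matched $index_hits line(s)"
rm -f "$hosts_file"
```

With plain names such as kafka02 both forms select the same lines, so the index() form only matters when the names may contain characters like ".", "[" or "*".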
Upvotes: 3
Reputation: 780851
When $kafka_name is empty, the pattern is "\smaster|\sworker|\s", and the last alternative matches any line containing whitespace, so it matches everything.
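A quick way to confirm this is to print the pattern after the shell has expanded it and count what it matches. The sketch below uses a small temporary file in place of /etc/hosts (the file and its three lines are only an illustration) and assumes a grep that understands \s, as GNU grep does.

```shell
# Hypothetical stand-in for /etc/hosts.
hosts_file=$(mktemp)
cat > "$hosts_file" <<'EOF'
23.1.22.162 kafka01.dfg.com
23.1.22.111 master01.dfg.com
23.1.22.167 worker01.dfg.com
EOF

kafka_name=""
pattern="\smaster|\sworker|\s$kafka_name"

# The pattern grep actually receives: the last alternative is a bare \s.
printf '%s\n' "$pattern"

# Count matching lines; every line has whitespace between IP and name,
# so all of them match, including the kafka line.
matched=$(grep -cE "$pattern" "$hosts_file")
echo "matched $matched line(s)"
rm -f "$hosts_file"
```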
One option is to set $kafka_name to something you know will never exist instead of an empty string, e.g.:
kafka_name=kafkaXXXX
Another is to add $kafka_name to the pattern only when it's not empty:
pattern="\smaster|\sworker"
if [ -n "$kafka_name" ]
then pattern="$pattern|\s$kafka_name"
fi
egrep "$pattern" /etc/hosts
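The same idea can be written more compactly with the shell's ${var:+word} expansion, which produces word only when var is set and non-empty. This is just a sketch: the function name filter_hosts and the temporary sample file are invented for the example, and \s again assumes GNU grep.

```shell
# Hypothetical stand-in for /etc/hosts.
hosts_file=$(mktemp)
cat > "$hosts_file" <<'EOF'
23.1.22.162 kafka01.dfg.com
23.1.22.111 master01.dfg.com
23.1.22.167 worker01.dfg.com
EOF

# filter_hosts NAME FILE (hypothetical helper):
# always matches master/worker; adds the NAME alternative only if NAME is non-empty.
filter_hosts() {
  pattern="\smaster|\sworker${1:+|\s$1}"
  grep -E "$pattern" "$2"
}

without_kafka=$(filter_hosts "" "$hosts_file" | grep -c .)
with_kafka=$(filter_hosts "kafka" "$hosts_file" | grep -c .)
echo "empty name: $without_kafka line(s); name=kafka: $with_kafka line(s)"
rm -f "$hosts_file"
```

With an empty name the pattern stays "\smaster|\sworker", so no trailing "|\s" alternative is ever created.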
Upvotes: 2