XYZ_Allen

Reputation: 323

Find exact word from a line and delete that line with grep/sed

I searched a few questions and answers on Stack Overflow, but none of them work for my case, and I don't know why my regular expression doesn't work. I'd really appreciate it if you could point out where my thinking is wrong.

Test case: text file contains

AllenZhou:9175186661:111th 1111 NY, 11111
XiaoyuZhou:9175186662:2222 222th 22222 NY 22222
Allen:1231231234:abc rd, PA

Here is my function:

checkEntry(){
    vaildName=true
    while read entry
    do
            if $( echo $entry | grep --quiet $name) # $name is read from the keyboard
            then
                    vaildName=false
            fi
    done < $fileName
}

If I enter Zhou, my function matches both AllenZhou and XiaoyuZhou. After a little research, I changed the grep command to

if $( echo $entry | grep --quiet ^$name:$)

It turns out that it never finds anything for AllenZhou or XiaoyuZhou – I am confused.

sed  -i -n /$name/d $fileName

This is the code I use to delete lines that contain the string pattern. The problem is the same as with grep: if I type Zhou or Allen, the command deletes both lines that contain the keyword. But when I change it to

sed  -i -n /\<$name\>/d $fileName

it won't delete for AllenZhou or XiaoyuZhou or Zhou... Again I am confused.

Upvotes: 0

Views: 1333

Answers (1)

tripleee

Reputation: 189387

Using a command substitution in the if is not doing what you think. You are capturing the output from grep (which with the -q option will always be an empty string) and passing that to if, which expects a command name or pipeline. The shell is left with an empty command to execute. In that situation it reports the exit status of the command substitution, so the condition happens to reflect grep's result, but only by accident; if grep printed anything, the shell would try to run that output as a command.

You want simply

if echo "$entry" | grep -q "$name"; then
    : stuff
fi

or more idiomatically and efficiently

if [[ "$entry" = *"$name"* ]]; then
    : stuff
fi

or even

case $entry in *"$name"*)
    : stuff;;
esac

(The double square brackets [[ ... ]] are Bash only, while case is portable to any POSIX shell, and even to the original Bourne shell. Single square brackets would also be portable and they can do ... something like this, but it's ugly, brittle, and more complex than you'd like.)

Also pay attention to the quoting. A variable containing an arbitrary string needs to be quoted.
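A small sketch of what the missing quotes can do. The value of entry here is made up for illustration (it is not from your file), and count_args is a hypothetical helper that only reports how many arguments it received:

```shell
#!/bin/sh
# Hypothetical value with repeated spaces and a glob character.
entry='Allen:1231231234:abc   rd, * PA'

# Hypothetical helper: prints the number of arguments it was given.
count_args() { echo "$#"; }

# Quoted: the string is passed through as a single, unmodified argument.
quoted=$(count_args "$entry")

# Unquoted: the shell splits on whitespace and may expand * to file names.
unquoted=$(count_args $entry)

echo "quoted: $quoted argument(s)"
echo "unquoted: $unquoted argument(s)"
```

The unquoted count depends on what is in the current directory, which is exactly the kind of surprise the quotes prevent.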

As another aside, you want to use read -r -- without options, the behavior of read is burdened with pesky legacy behavior for historical backwards compatibility in some corner cases.
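To illustrate the difference, here is a minimal sketch with a made-up input line containing a backslash (the line is an assumption for the demo, not from your data):

```shell
#!/bin/sh
# Hypothetical input line containing a backslash.
line='C:\temp:1231231234:abc rd'

# Without -r, read treats the backslash as an escape character and drops it.
plain=$(printf '%s\n' "$line" | { read entry; printf '%s' "$entry"; })

# With IFS= and -r, the line arrives unchanged.
raw=$(printf '%s\n' "$line" | { IFS= read -r entry; printf '%s' "$entry"; })

printf 'read:    %s\n' "$plain"
printf 'read -r: %s\n' "$raw"
```

Setting IFS= for the read additionally keeps leading and trailing whitespace, so while IFS= read -r entry is the usual form for a line-by-line loop.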

However, examining each line separately is just cumbersome. The entire function could be

grep -q "$name" "$fileName"

which also returns an actual result; something your function failed to do (except perhaps by setting a global variable, if that's what it's doing -- hard to tell from context. Even in shell script, using global variables in functions is a bad idea).
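As a sketch of what that could look like as a function, assuming you pass the name and file as arguments instead of reading globals (entry_exists is a hypothetical name, and the temporary file just mirrors the sample data from the question):

```shell
#!/bin/sh
# Hypothetical function: succeeds (exit status 0) if the name occurs
# anywhere in the file, fails otherwise. No global variables involved.
entry_exists() {
    grep -q -- "$1" "$2"
}

# Sample file mirroring the data in the question.
fileName=$(mktemp)
printf '%s\n' \
    'AllenZhou:9175186661:111th 1111 NY, 11111' \
    'XiaoyuZhou:9175186662:2222 222th 22222 NY 22222' \
    'Allen:1231231234:abc rd, PA' > "$fileName"

# The exit status plugs straight into if.
if entry_exists "Zhou" "$fileName"; then
    echo "Zhou: found"
else
    echo "Zhou: not found"
fi
```

Because the result is the exit status, the caller decides what to do with it, and nothing needs to be shared through a variable like vaildName.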

Perhaps you'll want some regex anchoring to restrict matching to the first field. grep "^[^:]*$name" looks for a match anywhere before the first colon.
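A quick sketch of the difference the anchoring makes. The Bob line is a hypothetical addition to your sample data, chosen so the search term also shows up in an address:

```shell
#!/bin/sh
# Sample data from the question, plus one hypothetical line (Bob) whose
# address contains the search term.
fileName=$(mktemp)
printf '%s\n' \
    'AllenZhou:9175186661:111th 1111 NY, 11111' \
    'XiaoyuZhou:9175186662:2222 222th 22222 NY 22222' \
    'Allen:1231231234:abc rd, PA' \
    'Bob:5550000000:12 Zhou St' > "$fileName"

name=Zhou

# Unanchored: counts a match anywhere on the line, including the address.
anywhere=$(grep -c "$name" "$fileName")

# Anchored: only characters other than ':' may precede the match,
# so it has to occur in the first field.
first_field=$(grep -c "^[^:]*$name" "$fileName")

echo "anywhere: $anywhere, first field only: $first_field"
```

This assumes $name contains no regex metacharacters; otherwise they would be interpreted as part of the pattern.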

There are no word boundaries inside these names (no whitespace, punctuation, etc.), just variations in capitalization, so there is no way for \< or \> to match between Allen and Zhou. Note also that with the sed script unquoted, the shell removes the backslashes before sed ever sees them. Observing your capitalization patterns, perhaps you want to require either an uppercase character or a colon after the match; "^[^:]*$name[[:upper:]:]"?
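For the deletion itself, the same kind of anchoring can be applied to sed. The following is only a sketch under a few assumptions: that you want to delete the line whose whole first field equals the name, that $name contains only letters, and that writing to a temporary file is acceptable (sed -i behaves differently between implementations). The -n option is left out on purpose: combined with d and no p command, it would suppress all output.

```shell
#!/bin/sh
# Sample file mirroring the data in the question.
fileName=$(mktemp)
printf '%s\n' \
    'AllenZhou:9175186661:111th 1111 NY, 11111' \
    'XiaoyuZhou:9175186662:2222 222th 22222 NY 22222' \
    'Allen:1231231234:abc rd, PA' > "$fileName"

name=Allen   # assumed to contain only letters

tmp=$(mktemp)
# The script is quoted so that sed, not the shell, interprets the pattern.
# "^$name:" matches only when the entire first field equals $name.
sed "/^$name:/d" "$fileName" > "$tmp" && mv "$tmp" "$fileName"

cat "$fileName"
```

With this pattern, Allen removes only the third line, and AllenZhou stays in the file.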

If the final goal is to extract an address or phone number, just do that directly instead. You'll want Awk instead of grep for that.

awk -F : -v name="$name" 'BEGIN { pat = name "($|[[:upper:]])"; result = 1 }
    $1 ~ pat { print $2; result = 0 }
    END { exit result }' "$fileName"

The Awk script prints the second field from any matching line and sets a result code, so you can use it in an if or while condition.
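As a usage sketch, the script can be wrapped in a function and used as a condition. lookup_phone is a hypothetical name, and the temporary file again mirrors the sample data from the question:

```shell
#!/bin/sh
# Hypothetical wrapper around the Awk script: prints the phone number of
# every entry whose name field matches, and fails if there is none.
lookup_phone() {
    awk -F : -v name="$1" 'BEGIN { pat = name "($|[[:upper:]])"; result = 1 }
        $1 ~ pat { print $2; result = 0 }
        END { exit result }' "$2"
}

# Sample file mirroring the data in the question.
fileName=$(mktemp)
printf '%s\n' \
    'AllenZhou:9175186661:111th 1111 NY, 11111' \
    'XiaoyuZhou:9175186662:2222 222th 22222 NY 22222' \
    'Allen:1231231234:abc rd, PA' > "$fileName"

# The exit status drives the if; the output is captured at the same time.
if phones=$(lookup_phone "Zhou" "$fileName"); then
    printf 'Phone numbers for Zhou:\n%s\n' "$phones"
else
    echo "No entry for Zhou"
fi
```

This relies on the awk implementation supporting POSIX character classes such as [[:upper:]], which very old versions of some implementations do not.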

Upvotes: 2
