handora

Reputation: 659

Why can't grep filter the right lines in a shell script?

The code is as follows:

#!/bin/bash

file=$(cat b.txt | tr "[:upper:]" "[:lower:]" | tr -c "[:alnum:]" '\n' | grep -v "^$")

cat a.txt | while read line; do
    file=$(echo "$file" | grep -ov "$line")
done
echo "$file" | sort | uniq -c | sort -n

Here b.txt is a file containing a paragraph and a.txt is a file with one word per line. I want to delete the words listed in a.txt from the words of b.txt and count what remains, but the result is wrong.

For example:

b.txt

Hello, i want to go to school

a.txt

hello
i

Expected result:

  1 go
  1 school
  1 want
  2 to

Actual result:

  1 go
  1 hello
  1 i
  1 school
  1 want
  2 to

My output still includes the words listed in a.txt.

Upvotes: 1

Views: 137

Answers (2)

William Pursell

Reputation: 212248

This is the classic subshell variable problem, triggered here by a useless use of cat. Because you pipe cat into the while loop, and each part of a pipeline runs in a subshell, the assignments to the variable file only affect the subshell's copy. The variable in the parent shell is never modified. The easiest solution is to redirect the file into the loop instead: while read line; do ... done < a.txt
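
A minimal sketch of the effect, assuming bash with its default settings (the lastpipe option is off). The variable names count_a and count_b and both function names are hypothetical and only serve the demonstration:

```shell
#!/bin/bash
# Hypothetical demo: none of these names come from the question.

count_piped() {
    count_a=0
    # The loop is the right-hand side of a pipe, so bash runs it in a
    # subshell; the increments are lost when that subshell exits.
    printf 'a\nb\nc\n' | while read -r line; do
        count_a=$((count_a + 1))
    done
    echo "$count_a"
}

count_redirected() {
    count_b=0
    # With a redirection (here a here-document) the loop runs in the
    # current shell, so the increments survive.
    while read -r line; do
        count_b=$((count_b + 1))
    done <<'EOF'
a
b
c
EOF
    echo "$count_b"
}

count_piped       # prints 0
count_redirected  # prints 3
```

Shells such as zsh or ksh run the last element of a pipeline in the current shell, so the first function may print 3 there.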

A second solution is to echo the variable from inside the same subshell, by grouping the loop and the final command in braces:

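cat a.txt | { while read -r line; do
    # -v drops matching lines, -x matches whole lines only,
    # -F treats the word as a literal string rather than a regex
    file=$(echo "$file" | grep -vxF -e "$line")
done
# Still inside the braces, so this sees the value updated by the loop
echo "$file" | sort | uniq -c | sort -n
}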
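
The question's grep -ov "$line" looks like a second problem, separate from the subshell: as far as I know, GNU grep prints nothing when -o is combined with -v, and a pattern such as i without -x also matches inside longer words. The following is a sketch of one way to get the expected counts without a loop, assuming a.txt already holds lowercase words, one per line. The function name count_remaining is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: count the words of a paragraph file ($2) that are
# not listed in a word-list file ($1, one lowercase word per line).
count_remaining() {
    tr '[:upper:]' '[:lower:]' < "$2" |
        tr -c '[:alnum:]' '\n' |   # put each word on its own line
        grep -v '^$' |             # drop the empty lines this creates
        grep -vxFf "$1" |          # -x whole line, -F literal, -f patterns from file
        sort | uniq -c | sort -n
}

# Usage with the sample data from the question, written to a temp directory
tmp=$(mktemp -d)
printf 'hello\ni\n' > "$tmp/a.txt"
printf 'Hello, i want to go to school\n' > "$tmp/b.txt"
count_remaining "$tmp/a.txt" "$tmp/b.txt"
rm -r "$tmp"
```

Reading all the words with -f avoids the loop entirely, so no variable has to survive a subshell.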

Upvotes: 3

luoluo

Reputation: 5533

AWK approach:

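awk 'BEGIN {
    # Split fields on commas and whitespace
    FS="[,[:space:]]";
}
{
    # First file (a.txt): record every word to exclude
    if (NR==FNR) {
        for(i=1;i<=NF;++i){
            black[tolower($i)]=1;
        }
        next;
    }
    # Second file (b.txt): count non-empty words that are not excluded
    for (i=1; i<=NF; ++i) {
        if ($i != "" && black[tolower($i)]!=1) {
            target[tolower($i)]+=1;
        }
    }
}
END {
    for (i in target) {
        print target[i], i;
    }
}' a.txt b.txt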

Output (line order may vary, since the traversal order of for-in in awk is unspecified):

1 want
1 go
2 to
1 school

Upvotes: 1
