D1X

Reputation: 5444

Using grep output as pattern for another grep

I have three files: a, b, and c. c has a list of codes. b has two columns: a code and its corresponding test name. The last file, a, has a list of names, each of which contains one of the test names as a substring. Examples:

c:

codeb
coded
codea
codec
codee
codef
codee
codeg
codeh

b:

codea   testa
codeb   testb
codec   testc
coded   testa
codee   testa
codef   testb
codeg   testc
codeh   testa

a:

testa1234
testb21345
14231testcAr

I want to output, for each code in c, the corresponding name from file a. For example, codeb should output testb21345. I haven't been able to make it work, and I suspect grep is not understanding the pattern. This is the loop I have written as a minimal example:

diractual=$PWD

while read line; do

        ca=$(grep $line $diractual/b | cut -f 2)  
        ca_complete=$(grep $ca $diractual/a)
        echo "This is ca:"
        echo "$ca"
        echo "This is ca_complete:"
        echo "$ca_complete"
done <$diractual/c

The two echos should output, for example for codeb (the first line in c):

        This is ca:
        testb
        This is ca_complete:
        testb21345

But it outputs (for every line):

        This is ca:
        testb
        This is ca_complete:

        #(Empty line)

So the first grep finds the correct test and stores it in the variable ca, but the second grep does not find this pattern in a.

Upvotes: 0

Views: 114

Answers (2)

clt60

Reputation: 63902

If I understand correctly, this should work (it needs GNU grep for the -P option):

filea="a"
fileb="b"
filec="c"
while read -r code
do
        printf "%s: %s\n" "$code" "$(grep "$(grep -oP "^$code\s+\K.*" "$fileb")" "$filea")"
done < "$filec"

prints

codeb: testb21345
coded: testa1234
codea: testa1234
codec: 14231testcAr
codee: testa1234
codef: testb21345
codee: testa1234
codeg: 14231testcAr
codeh: testa1234

or, split into separate steps:

while read -r code
do
        tst=$(grep -oP "^$code\s+\K.*" "$fileb")
        res=$(grep "$tst" "$filea")
        printf "%s\t%s\t%s\n" "$code" "$tst"  "$res"
done < "$filec"

prints

codeb   testb   testb21345
coded   testa   testa1234
codea   testa   testa1234
codec   testc   14231testcAr
codee   testa   testa1234
codef   testb   testb21345
codee   testa   testa1234
codeg   testc   14231testcAr
codeh   testa   testa1234
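
The loop above works on the sample data, whereas the loop in the question printed an empty ca_complete. The question does not say why. One possible cause, and it is only an assumption here, is that b was saved with Windows (CRLF) line endings, so the extracted test name carries an invisible trailing carriage return. The sketch below builds a hypothetical CRLF copy of b in a scratch directory and shows the lookup failing, then succeeding once the carriage return is deleted. The names workdir, raw, and clean are made up for illustration.

```shell
# Scratch copies of the sample data; b is written with CRLF endings on purpose.
workdir=$(mktemp -d)
printf '%s\n' testa1234 testb21345 14231testcAr > "$workdir/a"
printf '%s\t%s\r\n' codea testa codeb testb codec testc > "$workdir/b"

code=codeb
# Exact match on the first column; the value keeps its trailing carriage return.
raw=$(awk -v code="$code" '$1 == code { print $2 }' "$workdir/b")
# Fixed-string search finds nothing: no line of a contains a carriage return.
raw_match=$(grep -F -- "$raw" "$workdir/a" || true)
# Deleting carriage returns first makes the same lookup succeed.
clean=$(printf '%s' "$raw" | tr -d '\r')
clean_match=$(grep -F -- "$clean" "$workdir/a" || true)
printf 'raw: [%s] clean: [%s]\n' "$raw_match" "$clean_match"
```

If this is the cause, running od -c on b would show \r before each \n, and converting the file once (for example with tr -d '\r') would let the original loop work.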

Upvotes: 0

user000001

Reputation: 33317

Rather than using bash and grep, it would be simpler and likely faster to use a single awk invocation to produce the desired output. For example, with GNU awk for the ARGIND variable, you can write:

$ gawk 'ARGIND==1{a[$1]=$2}ARGIND==2{b[$1]}ARGIND==3{for(i in b) if ($0 ~ a[i]) print i, $0}' b c a
codeh testa1234
codea testa1234
coded testa1234
codee testa1234
codef testb21345
codeb testb21345
codeg 14231testcAr
codec 14231testcAr

In a more readable format it would be:

gawk ' ARGIND == 1 { a[$1] = $2 } 
       ARGIND == 2 { b[$1] }
       ARGIND == 3 {
           for (i in b) 
               if ($0 ~ a[i])
                   print i, $0
       }' b c a
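
ARGIND is specific to GNU awk, and the for (i in b) loop visits codes in an unspecified order and only once each, which is why codee appears a single time in the output above. A minimal sketch of a variant that should run on any POSIX awk and keeps the order (and duplicates) of c is shown below. It dispatches on FILENAME, reads a first, and uses index() for a plain substring test instead of a regex match. The array names names and tname, and the scratch directory, are made up for illustration.

```shell
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' codeb coded codea codec codee codef codee codeg codeh > "$workdir/c"
printf '%s\t%s\n' codea testa codeb testb codec testc coded testa \
                  codee testa codef testb codeg testc codeh testa > "$workdir/b"
printf '%s\n' testa1234 testb21345 14231testcAr > "$workdir/a"

# Files are read in the order a, b, c so that output follows the lines of c.
portable_out=$(cd "$workdir" && awk '
    FILENAME == "a" { names[++n] = $0; next }   # remember every full name
    FILENAME == "b" { tname[$1] = $2; next }    # map code to test name
    ($1 in tname) {                             # remaining lines come from c
        for (i = 1; i <= n; i++)
            if (index(names[i], tname[$1]))     # plain substring match
                print $1, names[i]
    }' a b c)
printf '%s\n' "$portable_out"
```

Because the lookup runs once per line of c, the repeated codee entry is printed twice here, matching the behaviour of the grep loops rather than the gawk one-liner.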

Upvotes: 1
