Reputation: 5444
I have three files: a, b, and c. c has a list of codes. b has two columns: a column of codes and a column with their corresponding test names. The last file, a, has a list of names which contain (as substrings) all test names. Examples:
c:
codeb
coded
codea
codec
codee
codef
codee
codeg
codeh
b:
codea testa
codeb testb
codec testc
coded testa
codee testa
codef testb
codeg testc
codeh testa
a:
testa1234
testb21345
14231testcAr
I want to output the corresponding name in a file for each code in c. For example, codeb should output testb21345. I haven't been able to make it work. I think this has to do with grep not understanding the pattern. This is the loop I have written as a minimal example:
diractual=$PWD
while read line; do
    ca=$(grep $line $diractual/b | cut -f 2)
    ca_complete=$(grep $ca $diractual/a)
    echo "This is ca:"
    echo "$ca"
    echo "This is ca_complete:"
    echo "$ca_complete"
done <$diractual/c
The two echos should output, for example for codeb (the first line in c):
This is ca:
testb
This is ca_complete:
testb21345
But it outputs (for every line):
This is ca:
testb
This is ca_complete:
#(Empty line)
So the first grep is finding the correct test and storing it in the variable ca, but the second one is not finding this pattern in a.
Upvotes: 0
Views: 114
Reputation: 63902
If I understand correctly, this should do it:
filea="a"
fileb="b"
filec="c"
while read -r code
do
    printf "%s: %s\n" "$code" "$(grep "$(grep -oP "^$code\s+\K.*" "$fileb")" "$filea")"
done < "$filec"
prints
codeb: testb21345
coded: testa1234
codea: testa1234
codec: 14231testcAr
codee: testa1234
codef: testb21345
codee: testa1234
codeg: 14231testcAr
codeh: testa1234
or, split into separate steps:
while read -r code
do
    tst=$(grep -oP "^$code\s+\K.*" "$fileb")
    res=$(grep "$tst" "$filea")
    printf "%s\t%s\t%s\n" "$code" "$tst" "$res"
done < "$filec"
prints
codeb testb testb21345
coded testa testa1234
codea testa testa1234
codec testc 14231testcAr
codee testa testa1234
codef testb testb21345
codee testa testa1234
codeg testc 14231testcAr
codeh testa testa1234
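Note that grep -P is a GNU grep extension, so the loops above may not work with other grep implementations. As a rough sketch of the same idea without -P, the lookup in b could be done with awk and the search in a with grep -F, which treats the test name as a literal string rather than a regex. The function name lookup_name and its argument order are my own invention for illustration, and I am assuming that b is whitespace-separated with the code in the first column:

```shell
# lookup_name CODES MAP NAMES  (hypothetical helper, not from the question)
# Example call with the files from the question: lookup_name c b a
lookup_name() {
    codes=$1 map=$2 names=$3
    while IFS= read -r code; do
        # Exact match on column 1 of the map file, print column 2.
        # Assumes the code contains no backslashes (awk -v interprets escapes).
        tst=$(awk -v c="$code" '$1 == c { print $2; exit }' "$map")
        # Skip unknown codes: an empty pattern would match every line.
        [ -n "$tst" ] || continue
        # -F: literal string match; --: guard against a leading dash.
        res=$(grep -F -- "$tst" "$names")
        printf '%s\t%s\t%s\n' "$code" "$tst" "$res"
    done < "$codes"
}
```

Comparing column 1 with == also avoids partial matches, for example a hypothetical code such as codeb2 being picked up when searching for codeb.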
Upvotes: 0
Reputation: 33317
Rather than using bash and grep, it would be simpler and likely faster to use a single awk invocation to produce the desired output. For example, with GNU awk for the ARGIND variable, you can write:
$ gawk 'ARGIND==1{a[$1]=$2}ARGIND==2{b[$1]}ARGIND==3{for(i in b) if ($0 ~ a[i]) print i, $0}' b c a
codeh testa1234
codea testa1234
coded testa1234
codee testa1234
codef testb21345
codeb testb21345
codeg 14231testcAr
codec 14231testcAr
In a more readable format it would be:
gawk '
    ARGIND == 1 { a[$1] = $2 }
    ARGIND == 2 { b[$1] }
    ARGIND == 3 {
        for (i in b)
            if ($0 ~ a[i])
                print i, $0
    }' b c a
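Two things to be aware of with the command above: for (i in b) visits the codes in an unspecified order, and because the codes are stored as array keys, a code that appears twice in c (codee here) is printed only once. If the order and the duplicates of c matter, something along the following lines might do, as a sketch only. It avoids ARGIND by comparing FILENAME against ARGV, so it should not need gawk, and it uses index() for a literal substring test instead of a regex match. The wrapper name codes_to_names is hypothetical, and I am assuming the three file names are distinct:

```shell
# codes_to_names MAP CODES NAMES  (hypothetical wrapper, not from the question)
# Example call with the files from the question: codes_to_names b c a
codes_to_names() {
    awk '
        # First file (b): remember code -> test name.
        FILENAME == ARGV[1] { if (NF >= 2) test[$1] = $2; next }
        # Second file (c): remember the codes in input order, duplicates included.
        FILENAME == ARGV[2] { if (NF) codes[++n] = $1; next }
        # Third file (a): remember every full name.
        { names[++m] = $0 }
        END {
            for (i = 1; i <= n; i++)
                for (j = 1; j <= m; j++)
                    # index() is a literal substring search, not a regex match.
                    if ((codes[i] in test) && index(names[j], test[codes[i]]))
                        print codes[i], names[j]
        }
    ' "$1" "$2" "$3"
}
```

The trade-off is that all three files are held in memory and the names are scanned once per code, which should be fine for small inputs like these.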
Upvotes: 1