Reputation: 3022
The awk below looks for the ids from file1 in $2 of file2 and, if they match, prints the matching line. If an id from file1 is not found in file2 (like ARRR and AAAA), I cannot figure out how to add it to the output as a new line in the same format: the next sequential number in $1, the id from file1 in $2, and the word missing in $3. Thank you :).
awk
awk -F'\t' 'NR==FNR{A[$1];next}$2 in A' file1 file2
file1 space delimited
AARS
AARS2
AARS2;TMEM151B
ARRR
AAAS
AAAA
AADAC
file2 tab-delimited
1 AARS 100.00
2 AARS2 100.00
3 AARS2;TMEM151B 100.00
4 AAAS 100.00
5 AADAC 100.00
desired output tab-delimited
1 AARS 100.00
2 AARS2 100.00
3 AARS2;TMEM151B 100.00
4 AAAS 100.00
5 AADAC 100.00
6 ARRR missing
7 AAAA missing
Upvotes: 1
Views: 129
Reputation: 92854
awk solution:
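awk 'NR==FNR{ a[$0]; next }   # file1: store each id as an array key
$2 in a{ delete a[$2] }       # file2: drop every id that was found
# END: whatever is left in a[] was never seen in file2; the trailing 1 prints each file2 line
END{ for(i in a) print ++FNR,i,"missing" }1' file1 OFS='\t' file2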
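The output (the order of the appended "missing" lines may vary, because awk's for-in loop visits array keys in an unspecified order):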
1 AARS 100.00
2 AARS2 100.00
3 AARS2;TMEM151B 100.00
4 AAAS 100.00
5 AADAC 100.00
6 AAAA missing
7 ARRR missing
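If the missing ids need to come out in the same order as in file1 (ARRR before AAAA, as in the desired output), one possible variation is to remember the order while reading file1 instead of relying on for (i in a). The following is only a sketch under a few assumptions: the temp directory and the names order, seen, n and last are illustrative and not from the question, ids are assumed to contain no whitespace, and numbering continues from the last value of $1 in file2 rather than from FNR.

```shell
# Recreate the sample inputs from the question in a throwaway directory.
tmp=$(mktemp -d)
printf '%s\n' AARS AARS2 'AARS2;TMEM151B' ARRR AAAS AAAA AADAC > "$tmp/file1"
printf '%s\t%s\t%s\n' \
  1 AARS 100.00 \
  2 AARS2 100.00 \
  3 'AARS2;TMEM151B' 100.00 \
  4 AAAS 100.00 \
  5 AADAC 100.00 > "$tmp/file2"

# file1 pass: store the ids in the order they appear.
# file2 pass: print each line unchanged, mark its id as seen, keep the last number.
# END: walk the file1 ids in their original order and append the unseen ones.
out=$(awk -v OFS='\t' '
  NR == FNR { order[++n] = $1; next }
  { seen[$2]; last = $1; print }
  END {
    for (i = 1; i <= n; i++)
      if (!(order[i] in seen))
        print ++last, order[i], "missing"
  }
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$out"
rm -rf "$tmp"
```

With the sample data this appends ARRR as line 6 and AAAA as line 7, matching the desired output in the question. Outside of this demo the same awk program can be run directly as awk -v OFS='\t' '...' file1 file2.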
Upvotes: 2