Reputation: 727
I would like to print a row of input file 11 if it contains fewer than two of the strings listed in NV_1.tab. Right now the script does not catch the strings in file 11 because it looks for an exact match, so the fields need to be cleaned before they are compared. I tried adding [^0-9] next to $i, but this does not seem to be allowed.
Thanks, Bernardo
awk 'NR==FNR{a[$1]++; next}   # first file: store each ID as an array key
{
    c=0
    for(i=2;i<=NF;i++){
        if($i in a){c++}      # exact match on the whole field
    }
}
c<=1' NV_1.tab 11
#NV_1.tab
HS302
HS303
HS304
HS305
HS319
HS321
HS322
HS323
HS324
HS326
HS327
HS328
HS329
HS330
HS331
HS332
HPSD74
#11
HPNK_11595 HS302_01873 HS303_01073
HPNK_11596 HPNK_11596 HPS_02673 HS302_01873
#current output
HPNK_11595 HS302_01873 HS303_01073
HPNK_11596 HPNK_11596 HPS_02673 HS302_01873
#desired output
HPNK_11596 HPNK_11596 HPS_02673 HS302_01873
Upvotes: 0
Views: 25
Reputation: 80931
The simplest way I see to do this is something like this.
Inside the for loop add
s=$i
gsub(/_.*$/, "", s)
and then replace ($i in a) with (s in a).
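For reference, here is a sketch of how the whole command might look with that change applied. It assumes the IDs in NV_1.tab never contain an underscore, so stripping everything from the first underscore in each field leaves the part to compare. The temporary directory and the shortened NV_1.tab are only there to make the example self-contained.

```shell
# Recreate small samples of the two input files from the question.
tmp=$(mktemp -d)
printf '%s\n' HS302 HS303 HS304 > "$tmp/NV_1.tab"
printf '%s\n' \
  'HPNK_11595 HS302_01873 HS303_01073' \
  'HPNK_11596 HPNK_11596 HPS_02673 HS302_01873' > "$tmp/11"

result=$(awk '
NR == FNR { a[$1]++; next }      # first file: store each ID as an array key
{
    c = 0
    for (i = 2; i <= NF; i++) {
        s = $i                   # work on a copy so the printed row is unchanged
        gsub(/_.*$/, "", s)      # drop everything from the first underscore on
        if (s in a) c++          # count fields whose prefix is a known ID
    }
}
c <= 1                           # print rows with fewer than two matches
' "$tmp/NV_1.tab" "$tmp/11")

echo "$result"
rm -rf "$tmp"
```

With the sample data this prints only the HPNK_11596 row, which matches the desired output. Copying $i into s matters: calling gsub on $i directly would modify the field and change the row that gets printed.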
Upvotes: 1