Reputation: 2356
I have an idFile:
1006006
1006008
1006011
1007002
......
and famFile:
1006 1006001 1006016 1006017 1
1006 1006006 1006016 1006017 1
1006 1006007 0 0 2
1006 1006008 1006007 1006006 2
1006 1006010 1006016 1006017 2
1006 1006011 1006016 1006017 1
1006 1006016 0 0 2
1006 1006017 0 0 1
1007 1007001 1007950 1007015 2
1007 1007002 1007014 1007015 2
......
I need to grep all the lines from famFile where the second column does not match any of the lines in idFile.
awk 'BEGIN { while ((getline < "idFile") > 0) id[$0]=1; }
id[$2] ' famFile
returns all the matches:
1006 1006006 1006016 1006017 1
1006 1006008 1006007 1006006 2
1006 1006011 1006016 1006017 1
1007 1007002 1007014 1007015 2
......
But how can I modify the command to get the complement of the matches?
Upvotes: 1
Views: 1653
Reputation: 37424
$ awk 'NR==FNR{a[$1];next} !($2 in a)' idFile famFile
1006 1006001 1006016 1006017 1
1006 1006007 0 0 2
1006 1006010 1006016 1006017 2
1006 1006016 0 0 2
1006 1006017 0 0 1
1007 1007001 1007950 1007015 2
Explained:
$ awk '
NR==FNR {         # process the idFile
    a[$1]         # hash to a
    next          # next id
}
!($2 in a)        # if the second field id is not in a, output record
' idFile famFile  # mind the file order
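If you would rather keep the getline approach from the question, negating the membership test should give the same complement. The following is only a sketch: the `complement` function name, the temporary directory, and the sample data (copied from the rows shown in the question) are there for illustration. The loop checks `> 0` so that it stops instead of spinning if the ID file cannot be read.

```shell
#!/bin/sh
# Illustrative sample data, copied from the rows shown in the question.
workdir=$(mktemp -d)

cat > "$workdir/idFile" <<'EOF'
1006006
1006008
1006011
1007002
EOF

cat > "$workdir/famFile" <<'EOF'
1006 1006001 1006016 1006017 1
1006 1006006 1006016 1006017 1
1006 1006007 0 0 2
1006 1006008 1006007 1006006 2
1006 1006010 1006016 1006017 2
1006 1006011 1006016 1006017 1
1006 1006016 0 0 2
1006 1006017 0 0 1
1007 1007001 1007950 1007015 2
1007 1007002 1007014 1007015 2
EOF

# complement IDFILE FAMFILE (hypothetical helper name)
# Prints the FAMFILE lines whose second field is NOT listed in IDFILE.
complement() {
    awk -v ids="$1" '
        BEGIN {
            # Load every ID as an array key; "> 0" ends the loop
            # on end-of-file and also on a read error.
            while ((getline line < ids) > 0)
                id[line]
            close(ids)
        }
        !($2 in id)   # print only records whose second field is not a known ID
    ' "$2"
}

complement "$workdir/idFile" "$workdir/famFile"
```

Using `!($2 in id)` rather than `!id[$2]` avoids creating an empty array entry for every ID that is looked up, which matters only for memory on large files; the output is the same.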
Upvotes: 2