justaguy

Reputation: 3022

awk to remove line if match in field is found

I am trying to remove a line from file2 if a match is found in file1. The match in file2 will be in a specific field, $5, in the part before the -. The awk below runs but doesn't specify a field to search, so it is hopefully just a start. Thank you :).

file1

AGRN
ABL
SCN1A

file2

chr1    955543  955763  chr1:955543-955763  AGRN-6|gc=75
chr1    957571  957852  chr1:957571-957852  AGRN-7|gc=61.2
chr1    970621  970740  chr1:970621-970740  BCR-8|gc=57.1
chr1    976035  976270  chr1:976035-976270  BCR-9|gc=74.5  

desired output (AGRN removed as it is in file1)

chr1    970621  970740  chr1:970621-970740  BCR-8|gc=57.1
chr1    976035  976270  chr1:976035-976270  BCR-9|gc=74.5 

awk

awk '!/file1/' file2

Upvotes: 1

Views: 660

Answers (2)

justaguy

Reputation: 3022

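I came up with this awk to confirm the grep. It compares only the part of $5 before the first - against the names in file1:

awk '
# first file: remember every name to remove
FILENAME == ARGV[1] {to_remove[$1]=1; next}
# second file: the name is the part of $5 before the first "-"
{split($5, name, "-")}
! (name[1] in to_remove) {print}' file1 file2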
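
To check a field-restricted filter against the sample data from the question, a minimal sketch follows. The scratch directory and the variable names (tmp, filtered, part) are assumptions made for illustration, not part of the original post. The key step is split($5, part, "-"), which isolates the text before the first - so that it can be looked up in the array built from file1.

```shell
# Hypothetical scratch directory holding the sample data from the question.
tmp=$(mktemp -d)
printf '%s\n' AGRN ABL SCN1A > "$tmp/file1"
printf '%s\n' \
  'chr1    955543  955763  chr1:955543-955763  AGRN-6|gc=75' \
  'chr1    957571  957852  chr1:957571-957852  AGRN-7|gc=61.2' \
  'chr1    970621  970740  chr1:970621-970740  BCR-8|gc=57.1' \
  'chr1    976035  976270  chr1:976035-976270  BCR-9|gc=74.5' > "$tmp/file2"

# First file: remember each name.
# Second file: take the part of $5 before the first "-" and print the
# line only if that name was not remembered.
filtered=$(awk '
FILENAME == ARGV[1] { to_remove[$1] = 1; next }
{ split($5, part, "-") }
!(part[1] in to_remove)
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$filtered"
rm -rf "$tmp"
```

With the sample data only the two BCR lines are printed, which matches the desired output in the question.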

Thanks again :)

Upvotes: 0

hek2mgl

Reputation: 157947

Use grep for that:

grep -vFf file1 file2

-f reads the search patterns from file1. -v inverts the match, so a line is removed from file2 if any of the patterns in file1 matches it. -F treats the patterns as fixed strings instead of regular expressions; unless you deliberately placed regular expressions in file1, -F is most likely what you want. Note that the patterns are matched anywhere on the line, not only in $5.
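
As a quick check of the grep command on the sample data, here is a small sketch. The scratch directory, the variable names and the extra pattern BC are hypothetical and only serve to show that fixed-string patterns match anywhere on a line, not just in the part of $5 before the -.

```shell
# Hypothetical scratch directory holding the sample data from the question.
tmp=$(mktemp -d)
printf '%s\n' AGRN ABL SCN1A > "$tmp/file1"
printf '%s\n' \
  'chr1    955543  955763  chr1:955543-955763  AGRN-6|gc=75' \
  'chr1    957571  957852  chr1:957571-957852  AGRN-7|gc=61.2' \
  'chr1    970621  970740  chr1:970621-970740  BCR-8|gc=57.1' \
  'chr1    976035  976270  chr1:976035-976270  BCR-9|gc=74.5' > "$tmp/file2"

# The command from the answer: drop every line containing a name from file1.
kept=$(grep -vFf "$tmp/file1" "$tmp/file2")
printf '%s\n' "$kept"

# Hypothetical extra pattern "BC": it is not a full name in file2, but as a
# fixed string it also matches inside "BCR-8" and "BCR-9".
printf '%s\n' BC >> "$tmp/file1"
# grep exits with status 1 when it prints nothing, hence the "|| true".
kept_loose=$(grep -vFf "$tmp/file1" "$tmp/file2" || true)
printf 'lines left with the extra pattern: %s\n' \
  "$(printf '%s' "$kept_loose" | grep -c . || true)"

rm -rf "$tmp"
```

For the data in the question the first command leaves only the two BCR lines. With the extra pattern every line is dropped, so if partial names could appear in file1, a field-based comparison is the safer choice.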

Upvotes: 1
