Reputation: 3
I'd like to compare two files and delete lines in file1 if they contain a pattern found anywhere in file2. I did some searching and the closest answers I've been able to find were how to delete lines that appear in another file.
I'd like a simple one-liner using grep, awk, sed, etc. if possible. I'm matching on IP addresses, as shown below.
file1
10.10.50.1 00:00:10:23 0000.0012.3456 Vlan1
10.10.50.2 00:00:12:34 1234.56AB.CDEF Vlan2
10.10.50.3 00:00:23:10 ABCD.EF12.345 Vlan3billion
file2
these-are some_words 10.10.50.2 andmaybe some-other words
theseare somewords 10.10.50.99 and-maybe some_other words
Expected output:
10.10.50.1 00:00:10:23 0000.0012.3456 Vlan1
10.10.50.3 00:00:23:10 ABCD.EF12.345 Vlan3billion
Upvotes: 0
Views: 717
Reputation: 15206
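More awk, with the core snaffled from karafka's answer. Note that gensub is a GNU awk (gawk) extension, so this one will not run under other awk implementations: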
$ awk 'NR==FNR{a[gensub(/^.* (([0-9]{1,3}\.){3}[0-9]{1,3}) .*$/,"\\1",1,$0)];next} !($1 in a)' file2 file1
10.10.50.1 00:00:10:23 0000.0012.3456 Vlan1
10.10.50.3 00:00:23:10 ABCD.EF12.345 Vlan3billion
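If gawk is not available, something along these lines might work with any POSIX awk. It is only a sketch and it assumes the IP address in file2 is always its own whitespace-separated field, which holds for the sample data but may not hold for yours. The scratch directory and the names `dir`, `seen` and `portable_result` are made up for the demo.

```shell
# Scratch copies of the sample files from the question.
dir=$(mktemp -d)
printf '%s\n' \
  '10.10.50.1 00:00:10:23 0000.0012.3456 Vlan1' \
  '10.10.50.2 00:00:12:34 1234.56AB.CDEF Vlan2' \
  '10.10.50.3 00:00:23:10 ABCD.EF12.345 Vlan3billion' > "$dir/file1"
printf '%s\n' \
  'these-are some_words 10.10.50.2 andmaybe some-other words' \
  'theseare somewords 10.10.50.99 and-maybe some_other words' > "$dir/file2"

portable_result=$(awk '
  # First file argument (file2): remember every field shaped like a dotted quad.
  NR == FNR {
    for (i = 1; i <= NF; i++)
      if ($i ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/) seen[$i]
    next
  }
  # Second file argument (file1): print lines whose first field was not remembered.
  !($1 in seen)
' "$dir/file2" "$dir/file1")

printf '%s\n' "$portable_result"
```

The pattern uses `+` instead of `{1,3}` on purpose, since some older awk builds do not support interval expressions. It is a shape check only and does not validate octet ranges.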
Upvotes: 0
Reputation: 124648
If I understand correctly, you want to exclude from the first file lines that would match any IP address in the second file.
This simple and admittedly a bit lazy solution might be good enough for your purpose:
grep -vFwf <(awk '{ print $3 }' file2) file1
The awk command extracts the 3rd column, which holds the IP addresses, and grep uses those as fixed patterns (-F) and only matches complete words (-w).
If the IP address is not always in the 3rd column, then you could extract the addresses by pattern matching with grep, as @tripleee suggested:
grep -vFwf <(grep -owE '[0-9]{1,3}(\.[0-9]{1,3}){3}' file2) file1
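Process substitution needs bash, zsh or ksh. If you are stuck with a plain sh, the same two steps can be run through a temporary pattern file instead. This is a rough sketch against the sample data. The octet pattern `[0-9]{1,3}` is my assumption about what counts as an IP address here, and `tmp`, `ips` and `grep_result` are just names picked for the demo.

```shell
# Recreate the sample data from the question in a scratch directory.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
10.10.50.1 00:00:10:23 0000.0012.3456 Vlan1
10.10.50.2 00:00:12:34 1234.56AB.CDEF Vlan2
10.10.50.3 00:00:23:10 ABCD.EF12.345 Vlan3billion
EOF
cat > "$tmp/file2" <<'EOF'
these-are some_words 10.10.50.2 andmaybe some-other words
theseare somewords 10.10.50.99 and-maybe some_other words
EOF

# Step 1: pull every dotted quad out of file2, one per line (-o prints only the match).
grep -owE '[0-9]{1,3}(\.[0-9]{1,3}){3}' "$tmp/file2" > "$tmp/ips"

# Step 2: drop file1 lines that contain any of those strings as a whole word.
# -F treats the dots literally, -w stops 10.10.50.2 from matching 10.10.50.21.
grep_result=$(grep -vFwf "$tmp/ips" "$tmp/file1")

printf '%s\n' "$grep_result"
```

Unlike the awk answers, this removes a line from file1 if the address appears anywhere on it, not just in the first column. For the sample data the result is the same.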
Upvotes: 1
Reputation: 67467
awk to the rescue!
$ awk 'NR==FNR{a[$3];next} !($1 in a)' file2 file1
10.10.50.1 00:00:10:23 0000.0012.3456 Vlan1
10.10.50.3 00:00:23:10 ABCD.EF12.345 Vlan3billion
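For anyone not used to the two-file idiom, here is the same one-liner spread out with comments. It is meant as an annotated sketch run against the sample data, not a different solution. The scratch directory and the names `work` and `awk_result` are invented for the demo.

```shell
# Scratch copies of the sample files from the question.
work=$(mktemp -d)
printf '%s\n' \
  '10.10.50.1 00:00:10:23 0000.0012.3456 Vlan1' \
  '10.10.50.2 00:00:12:34 1234.56AB.CDEF Vlan2' \
  '10.10.50.3 00:00:23:10 ABCD.EF12.345 Vlan3billion' > "$work/file1"
printf '%s\n' \
  'these-are some_words 10.10.50.2 andmaybe some-other words' \
  'theseare somewords 10.10.50.99 and-maybe some_other words' > "$work/file2"

awk_result=$(awk '
  NR == FNR {   # NR counts all lines read, FNR restarts per file: equal only in the first file
    a[$3]       # merely referencing a[$3] creates the key; no value is needed
    next        # skip the filter rule below while reading file2
  }
  !($1 in a)    # file1: print the line unless its first field is a stored key
' "$work/file2" "$work/file1")

printf '%s\n' "$awk_result"
```

One caveat with this idiom: if file2 is empty, NR == FNR stays true while file1 is read, so nothing is printed at all.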
Upvotes: 0