How to remove lines contained in file 1 from file 2 if in file 2 they are prefixed?

Question

I have the following situation:

source.txt

ID1:email1@domain1.com
ID2:email2@domain2.com
ID3:email3@domain3.com
...

IDs are numeric strings, e.g. 1234, 23412, 897... (one or more digits).

exclude.txt

emailX@domainX.com
emailY@domainY.com
emailZ@domainZ.com
...

i.e. only emails, no IDs.

I want to remove all lines from source.txt which contain emails listed in exclude.txt, preserving the ID:email pairs for the lines which are not removed.

How can I do that with Linux command-line tools (or a simple bash script if needed)?

George Vasiliou · Accepted Answer

You can do it easily with awk:

awk -F":" 'NR==FNR{a[$1];next}(!($2 in a))' exclude.txt source.txt
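
To see what the command does, here is a small worked example. It is only a sketch: the addresses, IDs and the scratch directory are made up for illustration, and the array is named seen instead of a purely for readability.

```shell
# Scratch directory so no real files are touched (sample data is hypothetical).
tmp=$(mktemp -d)

printf '%s\n' '101:alice@example.com' '202:bob@example.org' '303:carol@example.net' > "$tmp/source.txt"
printf '%s\n' 'bob@example.org' > "$tmp/exclude.txt"

# While reading the first file (NR==FNR), store each email as an array key.
# While reading the second file, print a line only if its second
# colon-separated field is not one of the stored keys.
result=$(awk -F':' 'NR==FNR { seen[$1]; next } !($2 in seen)' "$tmp/exclude.txt" "$tmp/source.txt")

printf '%s\n' "$result"
# 101:alice@example.com
# 303:carol@example.net
```

The comparison is an exact match on the whole email field, so partial overlaps between addresses do not matter. Two things to watch for: if exclude.txt is empty, NR==FNR is also true while reading source.txt and nothing is printed, and files with Windows line endings will not match unless the trailing carriage returns are stripped first.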

Alternative with grep:

grep -v -F -f exclude.txt source.txt

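Use grep with care: with -F each line of exclude.txt is treated as a fixed string rather than a regex, but grep still matches it anywhere in the line, so an excluded address that is a substring of a longer address removes that longer one too. Adding the -w option (word matching) reduces such false matches, though it does not rule them out entirely.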
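
The difference between the two approaches can be sketched with a deliberately awkward data set. The addresses below are hypothetical and chosen so that one excluded address is a substring of the others; the behaviour in the comments is what I would expect from GNU grep.

```shell
# Scratch directory with made-up sample data.
tmp=$(mktemp -d)
printf '%s\n' '1:bob@example.org' '2:jbob@example.org' '3:j.bob@example.org' > "$tmp/source.txt"
printf '%s\n' 'bob@example.org' > "$tmp/exclude.txt"

# -F alone matches the fixed string anywhere in the line, so every line
# is dropped. grep exits non-zero when it prints nothing, hence "|| true".
plain=$(grep -v -F -f "$tmp/exclude.txt" "$tmp/source.txt" || true)

# -w requires a non-word character (or line edge) on both sides of the match.
# "j" is a word character, "." is not, so only line 2 survives.
word=$(grep -v -w -F -f "$tmp/exclude.txt" "$tmp/source.txt" || true)

# awk compares the whole second field, so lines 2 and 3 both survive.
exact=$(awk -F':' 'NR==FNR { seen[$1]; next } !($2 in seen)' "$tmp/exclude.txt" "$tmp/source.txt")

printf 'grep -F:    [%s]\n' "$plain"
printf 'grep -F -w: [%s]\n' "$word"
printf 'awk:\n%s\n' "$exact"
```

If the addresses in your data can overlap like this, the awk version is the safer choice because it compares the email field exactly.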
