theuniverseisflat

Reputation: 881

Use awk sed command and while loop to remove entries from second file

I have two output files:

  1. FILE-A contains 70,000+ unique entries.
  2. FILE-B contains a unique listing of entries that I need to remove from FILE-A.

FILE-A:

 TOM
 JACK
 AILEY
 BORG
 ROSE
 ELI

FILE-B Content:

 TOM
 ELI

I want to remove anything listed in FILE-B from FILE-A.

FILE-C (Result file):

 JACK
 AILEY
 BORG
 ROSE

I assume I need a while or for loop. Can someone help me with this? I need to read FILE-A and, for every line in FILE-B, remove that line from FILE-A.

What command should I use?

Upvotes: 3

Views: 770

Answers (4)

Jahid

Reputation: 22428

You don't need a loop; a single awk or sed command is enough:

awk:

awk 'FNR==NR {a[$0];next} !($0 in a)' FILE-B FILE-A >FILE-C

sed:

sed "s=^=/^=;s=$=$/d=" FILE-B | sed -f- FILE-A >FILE-C

Note:

  1. While the sed version works for the data shown, it won't correctly handle any text in FILE-B that can be interpreted as a regex pattern.
  2. The awk solution reads FILE-B entirely into memory. Unlike the sed solution, it compares lines as literal text and does not interpret them as regex patterns.
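As a minimal sketch of how the awk version works, the snippet below recreates the sample data from the question in a scratch directory and runs the same two-pass logic. The `workdir` variable and the array name `seen` are illustrative choices, not part of the original answer.

```shell
# Recreate the sample data in a scratch directory (paths are illustrative).
workdir=$(mktemp -d)
printf '%s\n' TOM JACK AILEY BORG ROSE ELI > "$workdir/FILE-A"
printf '%s\n' TOM ELI > "$workdir/FILE-B"

# While reading the first file (FNR==NR), store each FILE-B line as an array key.
# While reading the second file, print only lines that are not stored keys.
awk 'FNR==NR { seen[$0]; next } !($0 in seen)' \
    "$workdir/FILE-B" "$workdir/FILE-A" > "$workdir/FILE-C"

cat "$workdir/FILE-C"
```

Because the lookup is an exact string match on the whole line, characters such as `.` or `*` in FILE-B are compared literally. The original order of FILE-A is preserved.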

Upvotes: 1

anubhava

Reputation: 785176

You can use grep -v -f:

grep -xFvf FILE-B FILE-A
JACK
AILEY
BORG
ROSE

Upvotes: 4

karakfa

Reputation: 67507

If you start with sorted input, the tool for this task is comm:

comm -23 FILE-A FILE-B

The options mean:

-2              suppress lines unique to FILE-B
-3              suppress lines that appear in both files

If the files are not sorted initially, you can do the following:

comm -23 <(sort FILE-A) <(sort FILE-B)
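One consequence worth seeing in practice: with the sample data from the question, the comm approach returns the remaining entries in sorted order rather than in FILE-A's original order. The sketch below uses temporary sorted copies instead of process substitution so that it also runs in a plain POSIX shell; the `workdir` path and the `.sorted` file names are illustrative, and `LC_ALL=C` is set on the assumption that sort and comm should agree on one collation order.

```shell
# Recreate the sample data in a scratch directory (paths are illustrative).
workdir=$(mktemp -d)
printf '%s\n' TOM JACK AILEY BORG ROSE ELI > "$workdir/FILE-A"
printf '%s\n' TOM ELI > "$workdir/FILE-B"

# comm requires sorted input; sort both files with the same collation.
LC_ALL=C sort "$workdir/FILE-A" > "$workdir/FILE-A.sorted"
LC_ALL=C sort "$workdir/FILE-B" > "$workdir/FILE-B.sorted"

# Keep only column 1: lines unique to the first file.
result=$(LC_ALL=C comm -23 "$workdir/FILE-A.sorted" "$workdir/FILE-B.sorted")
printf '%s\n' "$result"
# Prints AILEY, BORG, JACK, ROSE (sorted, not the original order).
```

If the original order of FILE-A matters, one of the grep or awk answers is likely a better fit.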

Upvotes: 1

lcd047

Reputation: 5861

You don't need awk, sed, or a loop. You just need grep:

fgrep -vxf FILE-B FILE-A

Please note the use of -x to match entries exactly.
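To illustrate why -x matters, here is a small sketch with a hypothetical extra entry, TOMMY, that is not in the question's data. It uses `grep -F`, which is the POSIX spelling of fgrep; the `workdir` path is illustrative.

```shell
# Hypothetical data: TOMMY contains TOM as a substring.
workdir=$(mktemp -d)
printf '%s\n' TOM TOMMY JACK > "$workdir/FILE-A"
printf '%s\n' TOM > "$workdir/FILE-B"

# Without -x, any line containing "TOM" is dropped, so TOMMY is lost too.
without_x=$(grep -vFf "$workdir/FILE-B" "$workdir/FILE-A")

# With -x, only lines that equal "TOM" exactly are dropped.
with_x=$(grep -vxFf "$workdir/FILE-B" "$workdir/FILE-A")

printf 'without -x:\n%s\n' "$without_x"
printf 'with -x:\n%s\n' "$with_x"
```

Without -x only JACK survives; with -x both TOMMY and JACK are kept.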

Output:

JACK
AILEY
BORG
ROSE

Upvotes: 5
