Reputation: 3
Ok so I'm still learning the command line stuff like grep and diff and their uses within the scope of my project, but I can't seem to wrap my head around how to approach this problem.
So I have 2 files, each containing hundreds of 20-character strings. Let's call the files A and B. I want to search through A and, using the values in B as keys, locate the unique string entries that occur in A but not in B (there are duplicates, so unique is the key here).
Any ideas?
Also I'm not opposed to finding the answer myself, but I don't have a good enough understanding of the different command line scripts and their functions to really start thinking of how to use them together.
Upvotes: 0
Views: 1269
Reputation: 942
There are two ways to do this: with comm, or with grep, sort, and uniq.
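comm
comm -2 -3 afile bfile
comm compares two sorted files and outputs 3 columns: lines only in afile, lines only in bfile, and lines in common. The -2 and -3 switches tell comm not to print the second and third columns, which leaves only the lines that occur in afile but not in bfile. Both files must already be sorted; running each through sort -u first also removes the duplicates.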
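As a rough sketch of the comm approach on made-up data: the fruit names and the temporary directory below are purely illustrative assumptions, standing in for the real 20-character strings in A and B. The files are sorted with sort -u first because comm expects sorted input.

```shell
# Hypothetical sample data; afile and bfile stand in for the real files A and B.
tmpdir=$(mktemp -d)
printf '%s\n' cherry apple banana apple date > "$tmpdir/afile"
printf '%s\n' banana elderberry > "$tmpdir/bfile"

# comm expects sorted input; sort -u also drops duplicate lines.
sort -u "$tmpdir/afile" > "$tmpdir/afile.sorted"
sort -u "$tmpdir/bfile" > "$tmpdir/bfile.sorted"

# -2 hides lines only in bfile, -3 hides lines in both,
# so only the lines unique to afile remain.
only_in_a=$(comm -23 "$tmpdir/afile.sorted" "$tmpdir/bfile.sorted")
printf '%s\n' "$only_in_a"

rm -rf "$tmpdir"
```

With this sample data, banana is dropped because it also appears in bfile, and the repeated apple is printed only once.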
grep, sort, uniq
grep -F -v -f bfile afile | sort | uniq
or just
grep -F -v -f bfile afile | sort -u
if your sort handles the -u option.
(Note: the command fgrep, if your system has it, is equivalent to grep -F.)
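Here is a small sketch of the grep pipeline on made-up data; the fruit names and temporary directory are illustrative assumptions, not the real files. It also adds -x, which is not in the answer above: it makes grep match whole lines only, which may be safer if one key could ever appear as a substring of another line.

```shell
# Hypothetical sample data; afile and bfile stand in for the real files A and B.
tmpdir=$(mktemp -d)
printf '%s\n' cherry apple banana apple date > "$tmpdir/afile"
printf '%s\n' banana elderberry > "$tmpdir/bfile"

# -F: treat patterns as fixed strings, not regular expressions
# -x: match whole lines only (an addition to the answer's command)
# -v: print the lines that do NOT match
# -f: read the patterns from bfile, one per line
unique_to_a=$(grep -F -x -v -f "$tmpdir/bfile" "$tmpdir/afile" | sort -u)
printf '%s\n' "$unique_to_a"

rm -rf "$tmpdir"
```

Unlike comm, this version does not need the input files to be sorted beforehand; sort -u at the end only removes the duplicates from the result.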
Upvotes: 1
Reputation: 753615
Look up the comm command (POSIX comm) to do this. See also Unix command to find lines common in two files.
Upvotes: 1