Reputation: 317
I have UTF-8 plain-text lists of usernames, one per line, in list1.txt and list2.txt. Note, in case it is pertinent, that usernames may contain regex characters (e.g. ! ^ . ( and such) as well as spaces.
I want to get, and save to matches.txt, a list of all unique values occurring in both lists. I have little command-line expertise, but this almost gets me there:
grep -Ff list1.txt list2.txt > matches.txt
...but that treats "jdoe" and "jdoe III" as a match, returning "jdoe III" as the matched value. This is incorrect for the task. I need each per-line pattern to match the whole line, i.e. from ^ to $. I tried adding the -x flag, but that gets no matches at all (edit: see the comment on the accepted answer; I got the flag order wrong).
I'm on OS X 10.9.5 and I don't have to use grep; another command-line tool that solves the problem will do.
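To make the problem concrete, here is a minimal reproduction. The usernames and the scratch directory are hypothetical sample data, not the real lists, and the variable name `partial` is only for illustration.

```shell
# Hypothetical sample data in a scratch directory (not the real lists).
tmp=$(mktemp -d)
printf '%s\n' 'jdoe' 'a.smith' > "$tmp/list1.txt"
printf '%s\n' 'jdoe III' 'a.smith' 'bob' > "$tmp/list2.txt"

# -F disables regex interpretation, but each pattern may still match
# anywhere inside a line, so "jdoe" also selects "jdoe III".
partial=$(grep -Ff "$tmp/list1.txt" "$tmp/list2.txt")
printf '%s\n' "$partial"
```

With this sample data the output contains "jdoe III" even though that name is not in the first list, which is the unwanted behaviour described above.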
Upvotes: 0
Views: 4479
Reputation: 369
A very simple and straightforward way to do it, one that doesn't require doing all sorts of crazy things with grep, is as follows:
cat list1.txt list2.txt|grep match > matches.txt
Not only that, but it's also easier to remember (especially if you regularly use cat).
Upvotes: 0
Reputation: 16176
All you need to do is add the -x flag to your grep query:
grep -Fxf list1.txt list2.txt > matches.txt
The -x flag restricts matches to full-line matches (each PATTERN effectively becomes ^PATTERN$). I'm not sure why your attempt at -x failed. Maybe you put it after the -f, which must be immediately followed by its pattern file (here list1.txt)?
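As a sketch of how this behaves, the following runs the same command on hypothetical sample data in a scratch directory. The `sort -u` step is an addition of mine, assuming the output should list each common username only once even if it is repeated in list2.txt; it also reorders the output alphabetically.

```shell
# Hypothetical sample data, including names with regex characters and spaces.
tmp=$(mktemp -d)
printf '%s\n' 'jdoe' 'a.smith' 'x^y (1)' > "$tmp/list1.txt"
printf '%s\n' 'jdoe III' 'a.smith' 'x^y (1)' 'a.smith' 'axsmith' > "$tmp/list2.txt"

# -F: treat patterns as fixed strings (so "." and "^" are literal)
# -x: a pattern must match the entire line
# -f FILE: read the patterns from FILE (the file name must follow -f directly)
# sort -u: assumption, keeps one copy of each matched name
grep -Fxf "$tmp/list1.txt" "$tmp/list2.txt" | sort -u > "$tmp/matches.txt"
cat "$tmp/matches.txt"
```

Here "jdoe III" and "axsmith" are not reported, because neither is a whole-line match for any name in the first list. Note that writing the flags as -Ffx would make grep take "x" as the name of the pattern file, which is one way the flag order can go wrong.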
Upvotes: 2
Reputation: 785471
awk will be handier than grep here:
awk 'FNR==NR{a[$0]; next} $0 in a' list1.txt list2.txt > matches.txt
$0 is the current line, FNR is the line number within the current file, and NR is the overall line number (the two are equal only while awk is reading the first file). a[$0] creates an entry in the associative array (hash) a, keyed by the line. next ensures that the remaining clause ($0 in a) does not run when the current clause (the one for the first file) did. $0 in a is true when the current line exists as a key in the array a, so only lines present in both files are printed. They appear in their order of occurrence in the second file.
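Since the question asks for unique values, here is a hedged variant of the same one-liner that prints each common line only the first time it is seen. The array name `seen` and the sample data are hypothetical, chosen for illustration.

```shell
# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'jdoe' 'a.smith' > "$tmp/list1.txt"
printf '%s\n' 'jdoe III' 'a.smith' 'bob' 'a.smith' 'jdoe' > "$tmp/list2.txt"

# First file: record every line as a key of a, then skip to the next line.
# Second file: print a line only if it is a key of a and has not been
# printed before (seen[$0]++ is 0, i.e. false, on the first encounter).
awk 'FNR==NR { a[$0]; next } ($0 in a) && !seen[$0]++' \
    "$tmp/list1.txt" "$tmp/list2.txt" > "$tmp/matches.txt"
cat "$tmp/matches.txt"
```

With this sample data the result is "a.smith" followed by "jdoe": the repeated "a.smith" is printed once, and "jdoe III" is not treated as a match for "jdoe". One caveat: if the first file is empty, FNR==NR also holds for the second file, so nothing is printed, which happens to be the correct answer in that case.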
Upvotes: 1