Reputation: 914
I have two lists of files that I want to diff. The second list has more files in it, and because both lists are in alphabetical order, diffing them reports files (lines) that exist in both lists but at different positions.
I want to diff these two lists while ignoring each line's position in the list, so that I get only the new or missing lines.
Thank you.
Upvotes: 17
Views: 31310
Reputation: 2894
Do the following:
cat file1 file2 | sort | uniq -u
This will give you a list of the lines which are unique (i.e., not duplicated) across the two files.
Explanation:
1) cat file1 file2 will put all of the entries into one list
2) sort will sort the combined list
3) uniq -u will only output the entries which don't have duplicates
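One caveat with the pipeline above: uniq -u also drops a line that is repeated inside a single file, even when the other file lacks it. The following is a minimal sketch of one way around that, assuming illustrative sample data in a scratch directory (the tmp and unique_lines names are made up for this example):

```shell
# Scratch directory and sample lists (illustrative data, not from the question).
tmp=$(mktemp -d)
printf '%s\n' a.txt b.txt c.txt c.txt > "$tmp/file1"   # c.txt is listed twice
printf '%s\n' a.txt a1.txt b.txt b2.txt > "$tmp/file2"

# sort -u removes repeats inside each list, so uniq -u only sees
# duplicates caused by a line being present in both lists.
unique_lines=$( { sort -u "$tmp/file1"; sort -u "$tmp/file2"; } | sort | uniq -u )
printf '%s\n' "$unique_lines"
```

Without the two sort -u steps, c.txt would be missing from the result here, because it appears twice in file1.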
Upvotes: 17
Reputation: 20759
Use the comm command:
To demonstrate, let's create two input files:
$ cat <<EOF >a
> a.txt
> b.txt
> c.txt
> EOF
$ cat <<EOF >b
> a.txt
> a1.txt
> b.txt
> b2.txt
> EOF
Now, using the comm command to get what the question wanted:
$ comm -3 a b
	a1.txt
	b2.txt
c.txt
This shows a columnar output with missing files (lines in a but not in b) in the first column and extra files (lines in b but not in a) in the second column.
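comm expects both inputs to be sorted in the same collation order. As a rough sketch, assuming lists that arrive unsorted (the scratch directory and the only_in_a / only_in_b variable names are illustrative), each list can be sorted first and the two sides of the difference collected separately:

```shell
# Scratch directory with two unsorted sample lists (illustrative data).
tmp=$(mktemp -d)
printf '%s\n' c.txt a.txt b.txt > "$tmp/a"
printf '%s\n' b2.txt a.txt b.txt a1.txt > "$tmp/b"

# Sort and compare under the same locale so sort and comm agree on ordering.
LC_ALL=C sort "$tmp/a" > "$tmp/a.sorted"
LC_ALL=C sort "$tmp/b" > "$tmp/b.sorted"

# -23 hides columns 2 and 3, leaving lines found only in a.
only_in_a=$(LC_ALL=C comm -23 "$tmp/a.sorted" "$tmp/b.sorted")
# -13 hides columns 1 and 3, leaving lines found only in b.
only_in_b=$(LC_ALL=C comm -13 "$tmp/a.sorted" "$tmp/b.sorted")

printf 'missing from b:\n%s\n' "$only_in_a"
printf 'extra in b:\n%s\n' "$only_in_b"
```

Keeping the two sides in separate variables avoids having to parse the tab-indented columns afterwards.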
What does comm do?
Here's the output if the command is typed without any switches:
$ comm a b
		a.txt
	a1.txt
		b.txt
	b2.txt
c.txt
This shows three columns thus:
1) lines in a but not in b
2) lines in b but not in a
3) lines in both a and b
The numbered switches -1, -2 and -3 each hide the corresponding column from the output.
So for example:
-12 results in common lines only
-13 results in lines only in b
-23 results in lines only in a
-3 results in the symmetric difference
-123 results in no output
Upvotes: 9
Reputation: 274582
You can try this approach which involves "subtracting" the two lists as follows:
$ cat file1
a.txt
b.txt
c.txt
$ cat file2
a.txt
a1.txt
b.txt
b2.txt
1) print everything in file2 that is not in file1 i.e. file2 - file1
$ grep -vxFf file1 file2
a1.txt
b2.txt
2) print everything in file1 that is not in file2 i.e. file1 - file2
$ grep -vxFf file2 file1
c.txt
(You can then do what you want with these diffs e.g. write to file, sort etc)
grep options descriptions:
-v, --invert-match select non-matching lines
-x, --line-regexp force PATTERN to match only whole lines
-F, --fixed-strings PATTERN is a set of newline-separated strings
-f, --file=FILE obtain PATTERN from FILE
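The two subtractions can also be combined into a single labelled report. This is only a sketch, assuming the same sample contents as above; the "- " and "+ " prefixes and the report variable are conventions chosen for this example, not something grep produces by itself:

```shell
# Scratch directory with the sample lists from this answer.
tmp=$(mktemp -d)
printf '%s\n' a.txt b.txt c.txt > "$tmp/file1"
printf '%s\n' a.txt a1.txt b.txt b2.txt > "$tmp/file2"

report=$(
  # file1 - file2: lines missing from file2, prefixed with "- "
  grep -vxFf "$tmp/file2" "$tmp/file1" | sed 's/^/- /'
  # file2 - file1: lines new in file2, prefixed with "+ "
  grep -vxFf "$tmp/file1" "$tmp/file2" | sed 's/^/+ /'
)
printf '%s\n' "$report"
```

Unlike comm, this does not need either list to be sorted, since grep compares each line against the whole pattern file.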
Upvotes: 24
Reputation: 7841
For the example you quoted, @Sparr:
a contains
a.txt
b.txt
c.txt
b contains
a.txt
a1.txt
b.txt
b2.txt
diff a b gives
1a2
> a1.txt
3c4
< c.txt
---
> b2.txt
What is it about this output that does not meet your needs?
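If the change commands (1a2, 3c4) and the --- separator are the unwanted part, they can be filtered out. A small sketch, assuming the same two files as above recreated in a scratch directory (the changes variable is illustrative):

```shell
# Recreate the two sample files from this answer in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' a.txt b.txt c.txt > "$tmp/a"
printf '%s\n' a.txt a1.txt b.txt b2.txt > "$tmp/b"

# Keep only the lines diff marks with "<" (only in a) or ">" (only in b).
changes=$(diff "$tmp/a" "$tmp/b" | grep '^[<>]')
printf '%s\n' "$changes"
```

This leaves one line per added or removed file, still tagged with its direction.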
Upvotes: 4
Reputation: 66709
Sorting the two lists before you diff them will give a more useful diff.
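As a sketch of that idea, assuming two unsorted sample lists in a scratch directory (file names and the diff_output variable are illustrative), each list is sorted into its own file before diff runs:

```shell
# Scratch directory with two unsorted sample lists (illustrative data).
tmp=$(mktemp -d)
printf '%s\n' c.txt a.txt b.txt > "$tmp/a"
printf '%s\n' b2.txt b.txt a1.txt a.txt > "$tmp/b"

# Sort both lists under the same locale so they share one ordering.
LC_ALL=C sort "$tmp/a" > "$tmp/a.sorted"
LC_ALL=C sort "$tmp/b" > "$tmp/b.sorted"

# diff exits with status 1 when the files differ, which is expected here.
diff_output=$(diff "$tmp/a.sorted" "$tmp/b.sorted" || true)
printf '%s\n' "$diff_output"
```

Once both lists share the same order, diff reports only the lines that were actually added or removed.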
Upvotes: 1
Reputation: 7712
If the lines are sorted, diff should catch the insertions and deletions just fine and only report the differences.
Upvotes: 0