Reputation: 105

How to eliminate lines from a file while comparing two files

I have two files. The second file contains everything that is in the first file, plus some more entries. I need an output file that contains everything that is in the second file but not in the first. I tried:

for j in `cat first`; do sed '/"$j"/d' second; done

# cat first
a
b
c
d
e
f
# cat second
a
1
b
22
33
c
44
d
11
e
44
f

Upvotes: 1

Answers (4)

RARE Kpop Manifesto

Reputation: 2915

UPDATE 1 : ultra truncated edition

mawk ' NR==FNR { __[$_] } NF -= $_ in __' FS='^$' \
                                          test_first_file.txt \
                                          test_second_file.txt
    1
    22
    33
    44
    11
    44

————————————————————————————————

[m/n/g]awk '
BEGIN { FS="^$" } NR==1 { 
   do { __[$-_] } while ((getline)<=(FNR==NR))

} ($-_ in __)!=!___[$-_]-- ' test_first_file.txt test_second_file.txt

————————————————————————————————

Upvotes: 0

anubhava

Reputation: 786091

Converting my comment to an answer so that the solution is easy to find for future visitors.

You may use this grep:

grep -vFxf first second

1
22
33
44
11
44

Options are:

-v: Selected lines are those not matching any of the specified patterns
-F: Treat patterns as fixed strings, not regular expressions
-x: Match whole lines only
-f: Read patterns from a file, one per line
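To see why -x matters, here is a small sketch that rebuilds the sample files from the question in a scratch directory. The file name third and the line banana are made up for illustration only; they are not part of the question.

```shell
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' a b c d e f > "$tmpdir/first"
printf '%s\n' a 1 b 22 33 c 44 d 11 e 44 f > "$tmpdir/second"

# -F: literal patterns; -x: a pattern must match the whole line;
# -f: read patterns from "first"; -v: print the lines that do NOT match.
grep_result=$(grep -vFxf "$tmpdir/first" "$tmpdir/second")
printf '%s\n' "$grep_result"

# Hypothetical extra file: without -x, the pattern "a" also matches "banana".
printf '%s\n' banana a > "$tmpdir/third"
# grep exits with status 1 when it prints nothing, hence the "|| true".
loose_result=$(grep -vFf "$tmpdir/first" "$tmpdir/third" || true)
strict_result=$(grep -vFxf "$tmpdir/first" "$tmpdir/third")
printf 'without -x: [%s]  with -x: [%s]\n' "$loose_result" "$strict_result"

rm -rf "$tmpdir"
```

Without -x, any line of second that merely contains a line of first as a substring would be dropped too, which is probably not what you want here.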

Upvotes: 4

Lurvas777

Reputation: 35

I prefer the answer from @anubhava; it's great for scripting. However, if you'd just like a visual aid to see the difference between two files, the good old diff command can be a great help.

$ diff -y first second
a                               a
                                  > 1
b                               b
                                  > 22
                                  > 33
c                               c
                                  > 44
d                               d
                                  > 11
e                               e
                                  > 44
f                               f

-y, or --side-by-side, prints the output in two columns.

I've seen this great one as well (full credit to @Kent):

$ awk 'NR==FNR{a[$0]++;next;}!($0 in a)' first second
1
22
33
44
11
44

There are more commands like these:

colordiff - like diff but with color
cmp - compare files bytewise
vimdiff - diff using the vim editor

There are probably lots of other great ways to do this; these are just a few.

Upvotes: 1

glenn jackman

Reputation: 247192

@anubhava's comment is a great answer.

With comm, suppress what's unique to first (-1) and what's common to both files (-3):

comm --nocheck-order -13 first second
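comm normally expects both inputs to be sorted, and --nocheck-order is a GNU option. If you'd rather not rely on either, one possible variant (a sketch, assuming you don't need the original line order) is to sort both files first. The .sorted file names are just placeholders.

```shell
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' a b c d e f > "$tmpdir/first"
printf '%s\n' a 1 b 22 33 c 44 d 11 e 44 f > "$tmpdir/second"

# comm expects both inputs sorted in the same collation order, so sort first.
LC_ALL=C sort "$tmpdir/first"  > "$tmpdir/first.sorted"
LC_ALL=C sort "$tmpdir/second" > "$tmpdir/second.sorted"

# -1 suppresses lines unique to the first file, -3 suppresses common lines,
# leaving only the lines unique to the second file.
comm_result=$(LC_ALL=C comm -13 "$tmpdir/first.sorted" "$tmpdir/second.sorted")
printf '%s\n' "$comm_result"

rm -rf "$tmpdir"
```

The trade-off is that the result comes out in sorted order rather than in the order the lines appear in second.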

There's a straightforward awk solution too.
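For completeness, a minimal sketch of what such an awk solution could look like, using the sample files from the question. The array name seen is my own choice.

```shell
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' a b c d e f > "$tmpdir/first"
printf '%s\n' a 1 b 22 33 c 44 d 11 e 44 f > "$tmpdir/second"

awk_result=$(awk '
    # While reading the first file (NR == FNR), remember each whole line.
    NR == FNR { seen[$0]; next }
    # For the second file, print only lines that were never seen.
    !($0 in seen)
' "$tmpdir/first" "$tmpdir/second")
printf '%s\n' "$awk_result"

rm -rf "$tmpdir"
```

This keeps the lines in their original order and compares whole lines. One caveat: if first is empty, NR == FNR stays true while second is being read, so nothing gets printed.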

Upvotes: 0
