a3kartik

Reputation: 105

How to eliminate lines from a file while comparing two files

I have two files. The second file contains everything that is in the first file, plus some extra entries. I need an output file containing every line that is in the second file but not in the first. I tried:

for j in `cat first`; do sed '/"$j"/d' second; done
# cat first
a
b
c
d
e
f
# cat second
a
1
b
22
33
c
44
d
11
e
44
f

Upvotes: 1

Views: 113

Answers (4)

RARE Kpop Manifesto

Reputation: 2819

UPDATE 1: ultra-truncated edition

mawk ' NR==FNR { __[$_] } NF -= $_ in __' FS='^$' \
     test_first_file.txt \
     test_second_file.txt
    1
    22
    33
    44
    11
    44

————————————————————————————————

[m/n/g]awk '
BEGIN { FS="^$" } NR==1 { 
   do { __[$-_] } while ((getline)<=(FNR==NR))

} ($-_ in __)!=!___[$-_]-- ' test_first_file.txt test_second_file.txt

————————————————————————————————

1
22
33
44
11
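
For readers who find the one-liners above hard to parse, here is a minimal readable sketch of what the second variant appears to do: remember every line of the first file, then print each line of the second file that was not remembered, but only the first time it shows up. The array names `in_first` and `printed` and the scratch directory are my own illustrative choices, not part of the original answer, and this is an approximation rather than an exact translation.

```shell
#!/bin/sh
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' a b c d e f > "$tmpdir/first"
printf '%s\n' a 1 b 22 33 c 44 d 11 e 44 f > "$tmpdir/second"

# in_first and printed are illustrative names, not from the original answer.
dedup_out=$(awk '
    NR == FNR { in_first[$0]; next }       # first file: remember each whole line
    !($0 in in_first) && !printed[$0]++    # second file: new lines, first occurrence only
' "$tmpdir/first" "$tmpdir/second")

printf '%s\n' "$dedup_out"
rm -rf "$tmpdir"
```

With the sample data this prints 1, 22, 33, 44, 11, so the repeated 44 appears only once.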

Upvotes: 0

anubhava

Reputation: 784958

Converting my comment to answer so that solution is easy to find for future visitors.

You may use this grep:

grep -vFxf first second

1
22
33
44
11
44

Options are:

  • -v: Selected lines are those not matching any of the specified patterns
  • -F: Fixed string search
  • -x: Exact match (a pattern must match the whole line)
  • -f: Use a file for patterns
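
To see how the options above combine, here is a small sketch that recreates the question's sample files in a scratch directory and runs the command on them. The scratch directory and the extra `banana` line are assumptions added purely for illustration; they show why -x matters, since without it any line that merely contains one of the patterns is dropped as well.

```shell
#!/bin/sh
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' a b c d e f > "$tmpdir/first"
printf '%s\n' a 1 b 22 33 c 44 d 11 e 44 f > "$tmpdir/second"

# Lines of second that are not exactly equal to any line of first.
only_in_second=$(grep -vFxf "$tmpdir/first" "$tmpdir/second")
printf '%s\n' "$only_in_second"

# Hypothetical extra line: "banana" contains "a" and "b" as substrings.
# Without -x it counts as a match and is filtered out by -v.
without_x=$(printf '%s\n' banana | grep -vFf "$tmpdir/first")
# With -x only whole-line matches count, so "banana" survives.
with_x=$(printf '%s\n' banana | grep -vFxf "$tmpdir/first")
printf 'without -x: [%s]  with -x: [%s]\n' "$without_x" "$with_x"

rm -rf "$tmpdir"
```

Note that grep does not remove duplicates: a line that occurs twice in the second file and never in the first is printed twice.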

Upvotes: 4

Lurvas777

Reputation: 35

I prefer the answer from @anubhava; it's great for scripting. However, if you'd just like a visual aid to see the difference between the two files, the good old diff command can be a great help.

$ diff -y first second
a                               a
                                  > 1
b                               b
                                  > 22
                                  > 33
c                               c
                                  > 44
d                               d
                                  > 11
e                               e
                                  > 44
f                               f

-y, or --side-by-side, prints the two files next to each other in two columns.

I've seen this great one as well (full credit to @Kent):

$ awk 'NR==FNR{a[$1]++;next;}!($0 in a)' first second
1
22
33
44
11
44
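
The awk one-liner above relies on the common two-file idiom, which can be spelled out as a commented sketch. The array name `seen` and the scratch directory are illustrative assumptions. This version also keys the array on `$0` rather than `$1`, which behaves the same for the single-word sample lines and keeps the lookup consistent for lines containing spaces.

```shell
#!/bin/sh
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' a b c d e f > "$tmpdir/first"
printf '%s\n' a 1 b 22 33 c 44 d 11 e 44 f > "$tmpdir/second"

awk_out=$(awk '
    # NR counts lines across all files, FNR restarts for each file,
    # so NR == FNR holds only while the first file is being read.
    NR == FNR { seen[$0]; next }
    # Second file: a pattern with no action prints the line when true.
    !($0 in seen)
' "$tmpdir/first" "$tmpdir/second")

printf '%s\n' "$awk_out"
rm -rf "$tmpdir"
```

One caveat with this idiom: if the first file is empty, NR == FNR is also true for the second file, so nothing is printed.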

There are more commands like these:

  • colordiff - like diff but with color
  • cmp - compare files bytewise
  • vimdiff - diff using the vim editor

There are probably lots of other great ways to do this; these are just a few of them.

Upvotes: 1

glenn jackman

Reputation: 246754

@anubhava's comment is a great answer.

With comm, ignore what's unique to first, and ignore what's common:

comm --nocheck-order -13 first second
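
The --nocheck-order option is a GNU extension, and comm is designed for sorted input. As a hedged alternative, here is a sketch that sorts copies of both files first and then uses plain `comm -13`. The scratch directory and the `.sorted` file names are assumptions for illustration, and the result comes out in sorted order rather than in the original order of the second file.

```shell
#!/bin/sh
# Recreate the sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' a b c d e f > "$tmpdir/first"
printf '%s\n' a 1 b 22 33 c 44 d 11 e 44 f > "$tmpdir/second"

# Sort copies of both files; LC_ALL=C makes sort and comm agree on ordering.
LC_ALL=C sort "$tmpdir/first"  > "$tmpdir/first.sorted"
LC_ALL=C sort "$tmpdir/second" > "$tmpdir/second.sorted"

# -1 drops lines unique to the first file, -3 drops lines common to both,
# leaving only the lines unique to the second file.
comm_out=$(LC_ALL=C comm -13 "$tmpdir/first.sorted" "$tmpdir/second.sorted")

printf '%s\n' "$comm_out"
rm -rf "$tmpdir"
```

With the sample data this prints 1, 11, 22, 33, 44, 44.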

There's a straightforward solution too.

Upvotes: 0
