Reputation: 11

diff 2 files with an output that does not include extra lines

I have two files, test and test1, and I would like to diff them without the extra lines 2a3, 4a6, 6a9 in the output, as shown below.

test:

mangoes
apples
banana
peach
mango
strawberry

test1:

mangoes
apples
blueberries
banana
peach
blackberries
mango
strawberry
star fruit

When I diff the two files:

$ diff test test1
2a3
> blueberries
4a6
> blackberries
6a9
> star fruit

How do I get this output instead?

$ diff test test1
blueberries
blackberries
star fruit

Upvotes: 1

Answers (3)

dawg

Reputation: 104092

You can use grep to keep only the lines that differ and drop the 2a3-style markers:

$ diff file1 file2 | grep '^[<>]'
> blueberries
> blackberries
> star fruit

If you also want to strip the direction markers (< and >) that show which file a line comes from, use sed:

$ diff file1 file2 | sed -n 's/^[<>] //p'
blueberries
blackberries
star fruit

(But it may be confusing to not see which file differs...)
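
If only the lines added in the second file matter, the filter can be narrowed to the > marker. The following is a minimal sketch, assuming the sample files from the question are recreated in a scratch directory; the names tmpdir and added are just for illustration:

```shell
# Recreate the two sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' mangoes apples banana peach mango strawberry > "$tmpdir/test"
printf '%s\n' mangoes apples blueberries banana peach blackberries \
    mango strawberry 'star fruit' > "$tmpdir/test1"

# Keep only the lines diff marks with "> " (present in test1, absent from
# test) and strip that two-character marker. Lines marked "< " are dropped.
added=$(diff "$tmpdir/test" "$tmpdir/test1" | sed -n 's/^> //p')
printf '%s\n' "$added"
```

Because lines marked < never reach the output, there is no ambiguity about which file a printed line came from.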

Upvotes: 2

Juan Diego Godoy Robles

Reputation: 14975

A solution using comm:

comm -13 <(sort test) <(sort test1)

Explanation

comm - compare two sorted files line by line

With no options, produce three-column output. Column one contains lines unique to FILE1, column two contains lines unique to FILE2, and column three contains lines common to both files.

-1 suppress column 1 (lines unique to FILE1)

-2 suppress column 2 (lines unique to FILE2)

-3 suppress column 3 (lines that appear in both files)

As we only need the lines unique to the second file test1, -13 is used to suppress the unwanted columns.

Process substitution, <(...), is used to pass comm the sorted input it requires without creating intermediate files.
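
Process substitution is a bash/zsh/ksh feature, so in a plain POSIX sh the same idea needs temporary sorted files. A rough sketch under that assumption follows; the .sorted file names and the variables tmpdir and only_in_test1 are made up for the example. LC_ALL=C is set so that sort and comm agree on the collation order:

```shell
# Recreate the two sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' mangoes apples banana peach mango strawberry > "$tmpdir/test"
printf '%s\n' mangoes apples blueberries banana peach blackberries \
    mango strawberry 'star fruit' > "$tmpdir/test1"

# comm expects sorted input; sort both files under the same locale comm uses.
LC_ALL=C sort "$tmpdir/test"  > "$tmpdir/test.sorted"
LC_ALL=C sort "$tmpdir/test1" > "$tmpdir/test1.sorted"

# -13 hides column 1 (only in test) and column 3 (in both files),
# leaving the lines that appear only in test1.
only_in_test1=$(LC_ALL=C comm -13 "$tmpdir/test.sorted" "$tmpdir/test1.sorted")
printf '%s\n' "$only_in_test1"
```

Note that the result comes out in sorted order (blackberries before blueberries), not in the order the lines appear in test1.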

Upvotes: 2

oguz ismail

Reputation: 50805

You can use awk:

awk 'NR==FNR{a[$0];next} !($0 in a)' test test1

NR==FNR is true only while the first file on the command line (i.e. test) is being read,
a[$0] stores each of its lines as a key in the array a,
next skips to the next line without running the remaining rule,
!($0 in a) is true when the current line is not a key in a, and a true pattern with no action prints the line.
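
To try this against the sample data, something like the following should work. It is only a sketch: the scratch directory and the variable names tmpdir and missing are my own, not part of the question:

```shell
# Recreate the two sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' mangoes apples banana peach mango strawberry > "$tmpdir/test"
printf '%s\n' mangoes apples blueberries banana peach blackberries \
    mango strawberry 'star fruit' > "$tmpdir/test1"

# First pass (NR==FNR): remember every line of test as an array key.
# Second pass: print each line of test1 that was never seen in test.
missing=$(awk 'NR==FNR{a[$0];next} !($0 in a)' "$tmpdir/test" "$tmpdir/test1")
printf '%s\n' "$missing"
```

The output keeps the order of test1, and neither file has to be sorted. One caveat: if the first file is empty, NR==FNR stays true for the second file as well, so nothing is printed.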

Upvotes: 1
