Reputation: 2458
I have two *nix files. All of the data in each file is on a single line, with each value separated by a null character. Some of the values in the two files match.
How would I parse this data into a new file listing only the matching values?
I figure I could use sed to change the null characters into newlines? From there on I'm not really sure...
Any ideas?
Upvotes: 5
Views: 6091
Reputation: 58478
This might work for you:
parallel 'tr "\000" "\n" <{} | sort -u' ::: file{1,2} | sort | uniq -d
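As a sketch of what that pipeline does (the file names and values below are invented for illustration), here is the same per-file logic run sequentially without GNU parallel: deduplicate each file, then keep the values that appear in both streams.

```shell
tmp=$(mktemp -d)
printf 'apple\0banana\0cherry\0' > "$tmp/file1"
printf 'banana\0date\0cherry\0'  > "$tmp/file2"
# Per file: convert nulls to newlines, then sort -u to drop in-file duplicates.
# uniq -d then keeps only values that came out of both files.
result=$({ tr '\000' '\n' < "$tmp/file1" | sort -u
           tr '\000' '\n' < "$tmp/file2" | sort -u
         } | sort | uniq -d)
echo "$result"
rm -rf "$tmp"
```

GNU parallel just runs the quoted command once per file; the `sort -u` inside it matters because `uniq -d` would otherwise report a value duplicated within a single file.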
Upvotes: 2
Reputation: 4416
If there are no duplicate values within file1 or file2, you can do this:
( tr '\0' '\n' < file1; tr '\0' '\n' < file2 ) | sort | uniq -c | egrep -v '^ +1 '
This prints each value common to both files, along with its count. Note the trailing space in the pattern: without it, counts of 10, 11, 100, and so on would also be filtered out.
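A minimal runnable sketch of this counting approach, with invented sample data:

```shell
tmp=$(mktemp -d)
printf 'red\0green\0blue\0'    > "$tmp/file1"
printf 'green\0blue\0yellow\0' > "$tmp/file2"
# Values present in both files get a count of 2; filter out the count-1 lines.
# The trailing space in the pattern keeps counts like 10 from being dropped.
result=$( ( tr '\0' '\n' < "$tmp/file1"; tr '\0' '\n' < "$tmp/file2" ) \
          | sort | uniq -c | grep -Ev '^ +1 ' )
echo "$result"
rm -rf "$tmp"
```

Piping the result through something like `awk '{print $2}'` strips the counts if you only want the values themselves.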
If you want to avoid the temporary combined stream, you can do this:
comm -1 -2 <(tr '\0' '\n' < file1) <(tr '\0' '\n' < file2)
Note that comm expects its inputs to be sorted, so this works as written only if the values within each file are already in sorted order; otherwise add a sort inside each substitution. This approach is also not portable: it requires Bash's process-substitution feature.
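A self-contained sketch of the process-substitution variant (Bash required; sample data invented). Since comm needs sorted input, a sort is added inside each substitution:

```shell
tmp=$(mktemp -d)
printf 'cat\0dog\0emu\0' > "$tmp/file1"
printf 'dog\0emu\0fox\0' > "$tmp/file2"
# Each <(...) feeds comm a sorted, newline-separated view of one file.
# -1 -2 suppresses lines unique to either file, leaving only common lines.
result=$(comm -1 -2 <(tr '\0' '\n' < "$tmp/file1" | sort) \
                    <(tr '\0' '\n' < "$tmp/file2" | sort))
echo "$result"
rm -rf "$tmp"
```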
Upvotes: 4
Reputation: 16185
Use tr, sort, and comm:
Convert nulls into new lines, and sort the result:
$ tr '\000' '\n' < file1 | sort > file1.txt
$ tr '\000' '\n' < file2 | sort > file2.txt
Then use comm to get the lines that are common to both files:
$ comm -1 -2 file1.txt file2.txt
<lines shown here are the common lines between file1.txt and file2.txt>
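Putting the two steps together as a runnable sketch (the sample values are invented; the intermediate .txt files are the sorted, newline-separated versions of the originals):

```shell
tmp=$(mktemp -d)
printf 'alpha\0beta\0gamma\0' > "$tmp/file1"
printf 'beta\0delta\0gamma\0' > "$tmp/file2"
# Step 1: nulls -> newlines, sorted, into intermediate files.
tr '\000' '\n' < "$tmp/file1" | sort > "$tmp/file1.txt"
tr '\000' '\n' < "$tmp/file2" | sort > "$tmp/file2.txt"
# Step 2: -1 drops lines unique to file1.txt, -2 drops lines unique
# to file2.txt, leaving only the lines common to both.
result=$(comm -1 -2 "$tmp/file1.txt" "$tmp/file2.txt")
echo "$result"
rm -rf "$tmp"
```

Unlike the process-substitution one-liner, this works in any POSIX shell, at the cost of two temporary files.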
Upvotes: 10