Neethu Lalitha

Reputation: 3071

Comparing the contents of two CSV files in Unix

I want to compare the contents of two CSV files in Unix. The rule is: match the Application_name across the two files; where it matches, compare the file_number and return a success or failure message, i.e. SUCCESS if the file_number matches as well and FAIL otherwise.

First file is:

file_number Application_name

25,AWX
78,UYH
90,TGY
89,GHB

Second file is:

file_number Application_name Date message

92, AWX, 2014-12-01 , SUCCESS 
66, AWX, 2014-12-02 , SUCCESS 
3, UYH, 2014-12-01 , SUCCESS 
3, TGY, 2014-12-02 , SUCCESS 
90, TGY , 2014-12-01 , SUCCESS 
89, GHB , 2014-12-02 , SUCCESS 

My final output should be like this:

AWX , FAIL
UYH, FAIL
TGY, SUCCESS
GHB, SUCCESS

Any help?

Upvotes: 0

Views: 123

Answers (1)

jherran

Reputation: 3367

First you need to sort your files, because join expects its inputs to be sorted on the join field.

$ sort input1.txt > filename1.txt
$ sort input2.txt | cut -f1,2,4 -d, > filename2.txt

In the second file, I removed the date because it is not needed in the output.

$ join -a1 -j1 -t, filename1.txt filename2.txt | cut -f2,4 -d, | sort > intermediate1.txt

Join the files on the first field (file_number), keeping unpairable lines from file 1 with -a1. Each joined line concatenates the fields from both files, so we keep only fields 2 and 4 with cut and then sort the output.

$ cat intermediate1.txt
AWX
GHB, SUCCESS
TGY, SUCCESS
UYH

$ awk '!/SUCCESS/{print $1", FAIL"}' intermediate1.txt > intermediate2.txt

Add the string ", FAIL" to the lines that do not contain SUCCESS.

$ join -a1 -t, intermediate1.txt intermediate2.txt > final.txt

Join again, this time on the application name, and you have it.

$ cat final.txt 
AWX, FAIL
GHB, SUCCESS
TGY, SUCCESS
UYH, FAIL
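
The same comparison can also be sketched in a single awk pass, without the intermediate files. This is a different technique from the join pipeline above, not a rewrite of it: it looks up each application name from the first file and checks whether the second file has a row with the same name and the same file_number. The function name compare_apps is hypothetical, and the sketch assumes both files are comma-separated, that the header lines contain no commas (as in the samples), and that each application appears only once in the first file.

```shell
# compare_apps FILE1 FILE2  (hypothetical helper name)
# Prints "<Application_name>, SUCCESS" or "<Application_name>, FAIL"
# for every application in FILE1, in FILE1's original order.
compare_apps() {
    awk -F',' '
        # Strip leading and trailing blanks from a field.
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }

        # First file: remember the expected file_number per application.
        # Lines without a comma (headers, blank lines) are skipped.
        NR == FNR {
            if (NF >= 2) { app = trim($2); want[app] = trim($1); order[++n] = app }
            next
        }

        # Second file: flag the application when name and file_number both match.
        NF >= 2 {
            app = trim($2)
            if ((app in want) && trim($1) == want[app]) ok[app] = 1
        }

        END {
            for (i = 1; i <= n; i++)
                print order[i] ", " ((order[i] in ok) ? "SUCCESS" : "FAIL")
        }
    ' "$1" "$2"
}
```

Calling it as compare_apps input1.txt input2.txt prints one line per application. Unlike the join approach, the output keeps the order of the first file rather than sorted order, and no sorting is needed beforehand.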

Upvotes: 1
