Gaman99
Gaman99

Reputation: 13

Compare 2 csv files using shellscript and output the difference in a new csv file

(say) I have 2 csv files - file1.csv & file2.csv as mentioned below:

**file1.csv**       
ID,version,cost                        
1000,1,30     
2000,2,40     
3000,3,50     
4000,4,60  



**file2.csv**      
ID,version,cost     
1000,1,30       
2000,2,45     
3000,4,55   
6000,5,70   

The expected o/p is a new csv file say - file3.csv should contain the details of IDs which are present in both the files but with some of the data related to it being different (here version and cost) - as shown below.

ID,field,old,new  
2000,cost,40,45     
3000,version,3,4    
3000,cost,50,55 

I need a unix bash/ksh script for doing this. I tried with sed/grep but could not get the proper o/p. Please help me out.

Upvotes: 0

Views: 7160

Answers (1)

Ansgar Wiechers
Ansgar Wiechers

Reputation: 200213

Not the most elegant of solutions, but it should work:

#!/usr/bin/ksh

csv1=file1.csv
csv2=file2.csv

echo "ID,field,old,new"
while read line; do
  grep "$line" "$csv2" >/dev/null && continue
  id=`echo "$line" | cut -d, -f1`
  grep "^$id," "$csv2" >/dev/null || continue
  version1=`echo "$line" | cut -d, -f2`
  cost1=`echo "$line" | cut -d, -f3`
  version2=`grep "^$id," "$csv2" | cut -d, -f2`
  cost2=`grep "^$id," "$csv2" | cut -d, -f3`
  if [ "$version1" != "$version2" ]; then
    echo "$id,version,$version1,$version2"
  fi
  if [ "$cost1" != "$cost2" ]; then
    echo "$id,cost,$cost1,$cost2"
  fi
done < "$csv1"

Upvotes: 3

Related Questions