Compare csv file

Question

I'm comparing two CSV files that each have two columns: a file name and the hash of that file. I need to find every file whose hash does not match, plus any new file in second.csv that does not exist in first.csv. I want to output those file names; for the data below, that would be BlockchainContextHelp.tsv and new1.tsv.

first.csv
#File,SHA-1
BlockchainContextHelp.tsv,1234562eertyrtyty3rer
new.tsv,7777hhrtdk12kefk23kfmsd

second.csv
#File,SHA-1
BlockchainContextHelp.tsv,123522234rrtkoe98877
new.tsv,7777hhrtdk12kefk23kfmsd
new1.tsv,3456734dfkekeruer7ererj

Below is what I tried so far.

#!/bin/bash
while IFS="," read f1 f2;do
        while IFS="," read c1 c2;do
                if [ $f2 != $c2 ]
                then
                        echo "$f1"
                fi
        done < second.csv
done < first.csv

I'd appreciate any suggestions.

anubhava · Accepted Answer

awk is a better tool for text processing. You may use:

awk 'BEGIN {
   FS = OFS = ","                # set input/output field separator as ,
}
NR == FNR {                      # While processing the first file
   map[$1] = $2                  # store the second column by the first
   next                          # move to next record
}
!($1 in map) || map[$1] != $2 {  # In 2nd file, if $1 is not in map or the
                                 # 2nd column differs from the hash stored
                                 # in map for that name
   print $1                      # print first column
}' first.csv second.csv

BlockchainContextHelp.tsv
new1.tsv
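
If it also matters why a name was printed, or if names that exist only in the first file should be reported too, the same map can be extended. The following is only a sketch under assumptions that are not in the question: report_changes is a hypothetical helper name, the labels new, mismatch and missing are illustrative, and the fields are assumed to contain no embedded commas or quotes.

```shell
# Sketch only: report_changes and the three labels are hypothetical names.
# Usage: report_changes first.csv second.csv
report_changes() {
    awk 'BEGIN {
       FS = OFS = ","                # comma-separated input and output
    }
    NR == FNR {                      # first file: remember hash per name
       map[$1] = $2
       next
    }
    {                                # second file
       seen[$1] = 1                  # remember that this name occurs here
       if (!($1 in map))
          print $1, "new"            # name is absent from the first file
       else if (map[$1] != $2)
          print $1, "mismatch"       # name is known but the hash differs
    }
    END {
       for (name in map)             # iteration order is unspecified in awk
          if (!(name in seen))
             print name, "missing"   # name occurs only in the first file
    }' "$1" "$2"
}
```

With the sample files from the question, report_changes first.csv second.csv prints:

BlockchainContextHelp.tsv,mismatch
new1.tsv,new

The header line is identical in both files, so it is never reported. Like the command above, this relies on NR == FNR being true only for the first file, so it assumes first.csv is not empty.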
