Reputation: 724
I'm trying to compare two files, say "file1" and "file2", column by column. Fields $1 and $2 are the same in both files. If a value differs in any other column, I want to print columns 1 and 2 and the number of the column where the mismatch was found, and also print the value found in the last column of the line with the error.
file1
36829.00 37145.00 10801 36840.00 36888.00 37146.00 37576 5 1
36833.00 38033.00 21601 36840.00 36888.00 37602.00 38464 5 1
37265.00 38105.00 25921 36840.00 36900.00 37674.00 38536 6 2
37271.00 38885.00 8841 36840.00 36876.00 38454.00 38894 4 3
file2
36829.00 37145.00 10801 36840.00 36888.00 37146.00 37576 5 1
36833.00 38033.00 21601 36840.00 36888.00 37602.00 38464 3 1
37265.00 38105.00 25921 36840.00 36900.00 37674.00 38536 6 2
37271.00 38885.00 8840 36840.00 36876.00 38454.00 38894 4 3
Desired output
Mismatch in # ( # is the value of the last column in the line with error )
Mismatch in 1: 36833.00 38033.00 column 8
Mismatch in 3: 37271.00 38885.00 column 3
I tried
awk 'NR==FNR{a[$1,$2];next} ($1,$2) in a' file1 file2
Thanks in advance
Upvotes: 1
Views: 119
Reputation:
Try this with GNU awk:
awk 'NR==FNR{r[NR]=$0;next}{x=split(r[FNR],a);for(i=3;i<=9;i++){if($i!=a[i]) print "Mismatch in "a[9]": "$1,$2" column "i}}' file1 file2
Upvotes: 1
Reputation: 37404
Here's one matching on files' record numbers:
awk '
NR == FNR {
a[FNR] = $0 # match on FNR, you could use a[$1, $2]
next
}
{
n = split(a[FNR], b, FS)
for (i = 3; i <= n; i++) {
if (b[i] != $i) {
printf "Mismatch in %d: %s %s column %d\n", FNR, $1, $2, i
} # for zero-based record numbering, use FNR-1 above
}
}' file1 file2
Output:
Mismatch in 2: 36833.00 38033.00 column 8
Mismatch in 4: 37271.00 38885.00 column 3
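The comment in the script above mentions keying on a[$1, $2] instead of FNR. A minimal sketch of that variant follows, assuming each ($1, $2) pair occurs only once in file1. It also prints the last-column value ($NF) rather than the record number, which is closer to what the question asks for. The function name compare_by_key is made up for this illustration.

```shell
# Recreate the sample data from the question.
cat > file1 <<'EOF'
36829.00 37145.00 10801 36840.00 36888.00 37146.00 37576 5 1
36833.00 38033.00 21601 36840.00 36888.00 37602.00 38464 5 1
37265.00 38105.00 25921 36840.00 36900.00 37674.00 38536 6 2
37271.00 38885.00 8841 36840.00 36876.00 38454.00 38894 4 3
EOF
cat > file2 <<'EOF'
36829.00 37145.00 10801 36840.00 36888.00 37146.00 37576 5 1
36833.00 38033.00 21601 36840.00 36888.00 37602.00 38464 3 1
37265.00 38105.00 25921 36840.00 36900.00 37674.00 38536 6 2
37271.00 38885.00 8840 36840.00 36876.00 38454.00 38894 4 3
EOF

# compare_by_key is a hypothetical helper name, not part of the answer above.
compare_by_key() {
    awk '
    NR == FNR {
        a[$1, $2] = $0            # store each line of the first file by its key
        next
    }
    ($1, $2) in a {               # only compare lines whose key exists in file1
        n = split(a[$1, $2], b, FS)
        for (i = 3; i <= n; i++)
            if (b[i] != $i)
                printf "Mismatch in %s: %s %s column %d\n", $NF, $1, $2, i
    }' "$1" "$2"
}

compare_by_key file1 file2
```

With the sample files this reports column 8 for the line whose last field is 1 and column 3 for the line whose last field is 3. Lines of file2 whose key is missing from file1 are silently skipped, so this only suits inputs where the keys really do match.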
Upvotes: 3
Reputation: 133478
Could you please try the following (assuming I understood your question correctly; it is based on your samples only). It should also handle multiple mismatches on a single line: if, say, the 3rd and 5th columns both differ in a line, it will print both of them.
awk '
FNR==NR{
a[FNR]=$0
b[FNR]=$1 OFS $2
next
}
{
num=split(a[FNR],array," ")
for(i=3;i<=num;i++){
if($i!=array[i]){
val=(val?val ",":"")i
}
}
if(val){
print "Mismatch in line" FNR": " b[FNR]" column(s) "val
val=""
}
}' Input_file1 Input_file2
Output will be as follows.
Mismatch in line2: 36833.00 38033.00 column(s) 8
Mismatch in line4: 37271.00 38885.00 column(s) 3
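To illustrate the multiple-mismatch case mentioned above, here is a small sketch that runs the same logic on two hypothetical one-line files in which columns 3 and 5 differ. The file names multi1 and multi2, their contents, and the wrapper name report_mismatches are made up for this example.

```shell
# Hypothetical inputs: same first two fields, columns 3 and 5 differ.
printf '1.00 2.00 10 20 30 40\n' > multi1
printf '1.00 2.00 11 20 31 40\n' > multi2

# report_mismatches is a hypothetical wrapper around the script above.
report_mismatches() {
    awk '
    FNR==NR{
      a[FNR]=$0                 # whole line from the first file
      b[FNR]=$1 OFS $2          # its first two fields, for the report
      next
    }
    {
      num=split(a[FNR],array," ")
      val=""
      for(i=3;i<=num;i++){
        if($i!=array[i]){
          val=(val?val ",":"")i # collect mismatching column numbers
        }
      }
      if(val){
        print "Mismatch in line" FNR": " b[FNR]" column(s) "val
      }
    }' "$1" "$2"
}

report_mismatches multi1 multi2
```

For these two files the script prints a single line listing both columns: Mismatch in line1: 1.00 2.00 column(s) 3,5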
Upvotes: 2