Reputation: 27
I have two files with different numbers of rows:
file 1:
31.32 29.15 46.77 106.40 11370
25.81 40.82 25.67 30.08 16365
27.11 42.32 14.48 50.04 18310.7
26.48 42.34 12.65 62.78 19607.5
24.48 46.00 17.16 11.86 22087.2
26.75 43.91 29.65 55.81 24032.7
30.91 34.85 15.25 50.93 26703
25.24 41.62 16.54 51.57 38032.9
23.48 41.97 17.33 50.88 48981.2
24.16 39.34 16.99 50.86 77513.4
22.90 41.59 19.76 50.31 135803
19.98 43.52 20.58 45.65 747049
19.96 43.64 20.43 45.37 809913
19.93 43.75 20.41 45.33 863931
and file 2:
12.4 -32.1 39.1 -44.9 135497.688
8.6 -38.6 39.3 -44.8 48981.191
1.0 -45.0 0.0 -54.0 45928.445
13.9 -70.1 39.4 -44.8 26702.982
I would like to compare these two files and produce the following output:
file 3
13.9 -70.1 30.91 34.85 39.4 -44.8 15.25 50.93 26702.982
8.6 -38.6 23.48 41.97 39.3 -44.8 17.33 50.88 48981.191
The problem is that the values in the corresponding columns of the two files do not match exactly. It would be fine if they matched within a certain error bound (e.g., +/- 1).
Annotating where values in file 3 come from, using F/R/C for File/Row/Column:
13.9 -70.1 30.91 34.85 39.4 -44.8 15.25 50.93 26702.982
2/4/1 2/4/2 1/7/1 1/7/2 2/4/3 2/4/4 1/7/3 1/7/4 2/4/5
8.6 -38.6 23.48 41.97 39.3 -44.8 17.33 50.88 48981.191
2/2/1 2/2/2 1/9/1 1/9/2 2/2/3 2/2/4 1/9/3 1/9/4 2/2/5
But:
Upvotes: 1
Views: 595
Reputation: 63892
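This:
# join needs both inputs sorted lexically on the join field, so sort in the C locale.
# file2 goes first so that its columns come first in the output.
(export LC_ALL=C; join -1 5 -2 5 \
    <(<file2 awk '{printf "%s %s %s %s %d\n",$1,$2,$3,$4,int($5+0.5);}' | sort -k5,5) \
    <(<file1 awk '{printf "%s %s %s %s %d\n",$1,$2,$3,$4,int($5+0.5);}' | sort -k5,5)
) | awk '{print $2, $3, $6, $7, $4, $5, $8, $9, $1}'
produces this for your input: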
13.9 -70.1 30.91 34.85 39.4 -44.8 15.25 50.93 26703
8.6 -38.6 23.48 41.97 39.3 -44.8 17.33 50.88 48981
The last column is rounded.
A more compact form:
cmd() {
    awk '{printf "%s %s %s %s %d\n",$1,$2,$3,$4,int($5+0.5);}' | sort -k5,5
}
(export LC_ALL=C; join -1 5 -2 5 <(<file2 cmd) <(<file1 cmd)) |
    awk '{print $2, $3, $6, $7, $4, $5, $8, $9, $1}'
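One caveat with matching on rounded values: it is not quite the same test as "within +/- 1". Two values can be very close and still round to different integers. The sketch below compares the two criteria; `check_pairs` is a hypothetical helper name, and the second and third input pairs are made-up illustration values, not taken from the question's files.

```shell
# Hypothetical helper: for each "v1 v2" input line, report whether the two
# values round to the same integer and whether they differ by at most 1.
check_pairs() {
    awk '{
        d = $1 - $2
        if (d < 0) d = -d
        printf "%s %s rounded=%s within1=%s\n", $1, $2,
            (int($1 + 0.5) == int($2 + 0.5) ? "same" : "differ"),
            (d <= 1 ? "yes" : "no")
    }'
}

# First pair is from the question; the other two are made-up examples.
printf '%s\n' '26703 26702.982' '10.4 10.6' '10.4 11.3' | check_pairs
```

For the question's data the two criteria agree, but `10.4` and `10.6` round to 10 and 11, so a rounding-based join would miss that pair even though the values differ by only 0.2.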
Upvotes: 4
Reputation: 3236
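Using only awk:
awk '
    NR==FNR {a[int($5+0.5)] = $0; next}    # file1: index rows by rounded column 5
    # file2: on a match, prepend the file1 row ($1..$5 = file1, $6..$10 = file2)
    (int($5+0.5) in a) {$0 = a[int($5+0.5)] " " $0; print $6,$7,$1,$2,$8,$9,$3,$4,$10}
' file1 file2
If you need the output sorted, pipe it into sort.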
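As a rough sketch of the sorting step, the following wraps an awk-only merge in a function and sorts the result numerically on the last (ninth) column. The function name `merge_rounded` and the temporary directory are assumptions for illustration; the sample rows are a subset of the question's file 1 plus all of file 2.

```shell
# Sample data copied from the question (file1 is a three-row subset).
dir=$(mktemp -d)
cat > "$dir/file1" <<'EOF'
30.91 34.85 15.25 50.93 26703
25.24 41.62 16.54 51.57 38032.9
23.48 41.97 17.33 50.88 48981.2
EOF
cat > "$dir/file2" <<'EOF'
12.4 -32.1 39.1 -44.9 135497.688
8.6 -38.6 39.3 -44.8 48981.191
1.0 -45.0 0.0 -54.0 45928.445
13.9 -70.1 39.4 -44.8 26702.982
EOF

# Hypothetical helper: $1 is the file1-style input, $2 the file2-style input.
# Rows are paired when their fifth columns round to the same integer, then
# the merged rows are sorted numerically on column 9.
merge_rounded() {
    awk '
        NR == FNR { a[int($5 + 0.5)] = $0; next }
        (int($5 + 0.5) in a) {
            split(a[int($5 + 0.5)], f)
            print $1, $2, f[1], f[2], $3, $4, f[3], f[4], $5
        }
    ' "$1" "$2" | LC_ALL=C sort -n -k9,9
}

merge_rounded "$dir/file1" "$dir/file2"
```

Without the final `sort`, the rows come out in file 2 order (48981.191 before 26702.982); with it, they appear in the order shown for file 3 in the question.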
Upvotes: 0
Reputation: 246754
awk '
    function close_enough(v1, v2,    delta) {
        delta = v1 - v2
        return (-1 <= delta && delta <= 1)
    }
    NR == FNR {
        key[$NF] = $0
        next
    }
    {
        for (val in key) {
            if (close_enough($NF, val)) {
                split(key[val], arr)
                print arr[1], arr[2], $1, $2, arr[3], arr[4], $3, $4, val
            }
        }
    }
' file2 file1 | column -t > file3
Upvotes: 2