Reputation: 359
Hey I'm trying to find the distance between records in a text file. I'm trying to do it using awk. An example input is:
1 2 1 4 yes
2 3 2 2 no
1 1 1 5 yes
4 2 4 0 no
5 1 0 1 no
I want to find the distance between each of the numerical values. I'm doing this by subtracting the values and then squaring the answer. I have tried the following code below but all the distances are simply 0. Any help would be appreciated.
BEGIN {recs = 0; fieldnum = 5;}
{
recs++;
for(i=1;i<=NF;i++) {data[recs,i] = $i;}
}
END {
for(r=1;r<=recs;r++) {
for(f=1;f<fieldnum;f++) {
##find distances
for(t=1;t<=recs;t++) {
distance[r,t]+=((data[r,f] - data[t,f])*(data[r,f] - data[t,f]));
}
}
}
for(r=1;r<=recs;r++) {
for(t=1;t<recs;t++) {
##print distances
printf("distance between %d and %d is %d \n",r,t,distance[r,t]);
}
}
}
Upvotes: 0
Views: 155
Reputation: 203995
No idea what you mean conceptually by the "distance between each of the numerical values" so I can't help you with your algorithm but let's clean up the code to see what that looks like:
$ cat tst.awk
{
for(i=1;i<=NF;i++) {
data[NR,i] = $i
}
}
END {
for(r=1;r<=NR;r++) {
for(f=1;f<NF;f++) {
##find distances
for(t=1;t<=NR;t++) {
delta = data[r,f] - data[t,f]
distance[r,t]+=(delta * delta)
}
}
}
for(r=1;r<=NR;r++) {
for(t=1;t<NR;t++) {
##print distances
printf "distance between %d and %d is %d\n",r,t,distance[r,t]
}
}
}
$
$ awk -f tst.awk file
distance between 1 and 1 is 0
distance between 1 and 2 is 7
distance between 1 and 3 is 2
distance between 1 and 4 is 34
distance between 2 and 1 is 7
distance between 2 and 2 is 0
distance between 2 and 3 is 15
distance between 2 and 4 is 13
distance between 3 and 1 is 2
distance between 3 and 2 is 15
distance between 3 and 3 is 0
distance between 3 and 4 is 44
distance between 4 and 1 is 34
distance between 4 and 2 is 13
distance between 4 and 3 is 44
distance between 4 and 4 is 0
distance between 5 and 1 is 27
distance between 5 and 2 is 18
distance between 5 and 3 is 33
distance between 5 and 4 is 19
Seems to produce some non-zero output....
Upvotes: 3