Reputation: 41
I have a file in the format shown below. Each line gives the relationship between a pair of individuals (5 in total), including each individual's relationship with itself.
1 1 1.0
2 1 0.5
3 1 0.1
4 1 0.3
5 1 0.1
2 2 1.0
3 2 0.5
4 2 0.2
5 2 0.3
3 3 1.0
4 3 0.5
5 3 0.3
4 4 1.0
5 4 0.1
5 5 1.0
I would like to use AWK to convert it into a full matrix format. It is necessary to have the rows and columns sorted numerically, as in the example.
1 2 3 4 5
1 1.0 0.5 0.1 0.3 0.1
2 0.5 1.0 0.5 0.2 0.3
3 0.1 0.5 1.0 0.5 0.3
4 0.3 0.2 0.5 1.0 0.1
5 0.1 0.3 0.3 0.1 1.0
I came across a previous thread (linked below), but the format of its input file is slightly different and I am struggling to adapt it. http://www.unix.com/shell-programming-and-scripting/203483-how-rearrange-matrix-awk.html
How can I perform this transformation?
Upvotes: 2
Views: 566
Reputation: 47109
As the upper and lower triangle are identical, would it not be enough to copy each element-pair to both indices in a multi-dimensional array, e.g.:
parse.awk
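# The matrix is symmetric, so store every value under both index orders
{ h[$1,$2] = h[$2,$1] = $3 }
END {
  # $1 and $2 still hold the fields of the last input line here
  for (i = 1; i <= $1; i++) {
    for (j = 1; j <= $2; j++)
      # pass the value as an argument, never as the printf format string
      printf "%s%s", h[i,j], (j < $2 ? OFS : ORS)
  }
}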
Run:
awk -f parse.awk infile
Output:
1.0 0.5 0.1 0.3 0.1
0.5 1.0 0.5 0.2 0.3
0.1 0.5 1.0 0.5 0.3
0.3 0.2 0.5 1.0 0.1
0.1 0.3 0.3 0.1 1.0
Note that this assumes the last line holds the largest indices, and that it prints only the values, without the row and column labels.
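If the last line cannot be relied on to hold the largest indices, one option is to track the maximum index while reading. The following is a minimal sketch rather than a definitive version: `to_matrix` is a hypothetical helper name, the labels are assumed to be the consecutive integers 1..n, and the `NA` placeholder for a missing pair is an assumption of this sketch. It also prints the row and column labels asked for in the question.

```shell
# to_matrix is a hypothetical helper name, not part of the answer above.
to_matrix() {
  awk '
    {
      h[$1,$2] = h[$2,$1] = $3          # symmetric: store both orientations
      if ($1 + 0 > n) n = $1 + 0        # remember the largest index seen so far
      if ($2 + 0 > n) n = $2 + 0
    }
    END {
      for (j = 1; j <= n; j++)          # header row; the corner cell stays empty
        printf "%s%s", OFS, j
      printf "%s", ORS
      for (i = 1; i <= n; i++) {
        printf "%s", i                  # row label
        for (j = 1; j <= n; j++)        # NA marks a pair missing from the input
          printf "%s%s", OFS, ((i,j) in h ? h[i,j] : "NA")
        printf "%s", ORS
      }
    }
  ' "$@"
}

# Example: the input lines may arrive in any order
printf '%s\n' '2 1 0.5' '1 1 1.0' '2 2 1.0' | to_matrix
```

With a file, the call would be `to_matrix infile`. Each header field is preceded by OFS, so the header line starts with a separator that stands in for the empty corner cell.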
Upvotes: 1
Reputation: 92854
Here we go, gawk solution:
matrixize.awk script:
#!/bin/awk -f
BEGIN {
    OFS = "\t"                               # output field separator
    # gawk only: make every for-in loop visit indices in ascending numeric order
    PROCINFO["sorted_in"] = "@ind_num_asc"
}
{
    b[$1]                 # accumulate the unique indices
    a[$1][$2] = $3        # array of arrays: the relation as given in the input
    a[$2][$1] = $3        # its mirror image, since the matrix is symmetric
}
END {
    h = ""
    for (i in b)
        h = h OFS i       # header row; the leading OFS leaves the corner cell empty
    print h
    for (i in b) {
        row = i           # first column holds the row index
        for (j in a[i])   # one value per column, in the same sorted order
            row = row OFS a[i][j]
        print row
    }
}
Usage:
gawk -f matrixize.awk yourfile
The output:
1 2 3 4 5
1 1.0 0.5 0.1 0.3 0.1
2 0.5 1.0 0.5 0.2 0.3
3 0.1 0.5 1.0 0.5 0.3
4 0.3 0.2 0.5 1.0 0.1
5 0.1 0.3 0.3 0.1 1.0
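The script above relies on gawk-only features (arrays of arrays, `asorti`). Where only a POSIX awk is available, a similar result can be sketched by collecting the labels into a plain array and sorting them numerically by hand. This is an illustrative sketch, not the answer's method: `matrix_by_label` is a hypothetical name, labels are assumed to be numeric but need not be consecutive, and a pair missing from the input prints as an empty cell.

```shell
# matrix_by_label is a hypothetical helper name used only for this sketch.
matrix_by_label() {
  awk '
    # record each label once, in order of first appearance
    function add(x) { if (!(x in seen)) { seen[x] = 1; ids[++n] = x } }
    {
      add($1); add($2)
      v[$1,$2] = v[$2,$1] = $3          # symmetric: store both orientations
    }
    END {
      # insertion sort of the labels, compared as numbers
      for (i = 2; i <= n; i++) {
        t = ids[i]
        for (j = i - 1; j >= 1 && ids[j] + 0 > t + 0; j--)
          ids[j + 1] = ids[j]
        ids[j + 1] = t
      }
      for (j = 1; j <= n; j++)          # header row; the corner cell stays empty
        printf "%s%s", OFS, ids[j]
      printf "%s", ORS
      for (i = 1; i <= n; i++) {
        printf "%s", ids[i]             # row label
        for (j = 1; j <= n; j++)
          printf "%s%s", OFS, v[ids[i], ids[j]]
        printf "%s", ORS
      }
    }
  ' "$@"
}

# Example with non-consecutive labels: 2 sorts before 10
printf '%s\n' '10 10 1.0' '10 2 0.4' '2 2 1.0' | matrix_by_label
```

The insertion sort is quadratic in the number of labels, which is negligible next to the n-by-n output it feeds. Adding `-v OFS='\t'` to the awk call would give tab-separated output like the script above.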
Upvotes: 2