Reputation: 23
EDIT The number of lines and tab-separate values is also dynamic as it can change. So it might be 1-5 or 1-10 with the same layout but the region will only be listed once.
I have a file with in the following format:(@TSV)
host1 host2 host3
id1 id2 id3
ip1 ip2 ip3
name1 name2 name3
role1 role2 role3
region
I can also format the file like:
host1
host2
host3
id1
id2
id3
ip1
ip2
ip3
name1
name2
name3
role1
role2
role3
region
I would like to write a new file or modify this file inline so the file is in this format: (tsv)
host1 id1 ip1 name1 role1 region
host2 id2 ip2 name2 role2 region
host3 id3 ip3 name3 role3 region
I have tried without success to use awk, sed, for loops... I need some fresh ideas.
Upvotes: 2
Views: 199
Reputation: 204731
The idiomatic awk approach to transposing rows to columns:
$ cat tst.awk
BEGIN { FS=OFS="\t" }
{
numCols = NR
numRows = (NF>numRows ? NF : numRows)
for (rowNr=1; rowNr<=NF; rowNr++) {
vals[rowNr,numCols] = $rowNr
}
}
END {
for (rowNr=1; rowNr<=numRows; rowNr++) {
for (colNr=1; colNr<=numCols; colNr++) {
val = ((rowNr,colNr) in vals ? vals[rowNr,colNr] : vals[1,colNr])
printf "%s%s", val, (colNr<numCols ? OFS : ORS)
}
}
}
$ awk -f tst.awk file
host1 id1 ip1 name1 role1 region
host2 id2 ip2 name2 role2 region
host3 id3 ip3 name3 role3 region
The above was run on your first input file:
$ cat file
host1 host2 host3
id1 id2 id3
ip1 ip2 ip3
name1 name2 name3
role1 role2 role3
region
Note the script makes no reference to any values in your input, nor how many rows or columns you have nor any other assumptions about the content of your input file except that if values are missing you want the first one repeated.
Upvotes: 1
Reputation: 67567
Starting with the list formatted version, if you had no missing data, i.e. "religion" 3 times, it would be much easier.
You can add the missing values on the fly and then simply pr
$ awk '1; END{print;print}' file | pr -6ts
host1 id1 ip1 name1 role1 region
host2 id2 ip2 name2 role2 region
host3 id3 ip3 name3 role3 region
if the number of columns are known and only the last values might be missing, you can parametrize by number of columns
$ cols=6; awk -v cols=$cols '1; END{for(i=1;i<=(NR-cols)/(cols-1);i++) print}' file |
pr -${cols}ts
Upvotes: 0
Reputation: 158280
You can use the following awk
script:
# translate.awk
NR==1 {
split($0,hosts)
}
NR==2 {
split($0,ids)
}
NR==3{
split($0,ips)
}
NR==4{
split($0,names)
}
NR==5{
split($0,roles)
}
NR==6{
region=$1
}
END{
OFS="\t"
for(i in hosts) {
print hosts[i], ids[i], ips[i], names[i], roles[i], region
}
}
Call it like this:
awk -f translate.awk input.file
Upvotes: 0