Reputation: 139
I have a details.txt file that contains the data below:
size=190000
date=1603278566981
repo-name=testupload
repo-path=/home/test/testupload
size=140000
date=1603278566981
repo-name=testupload2
repo-path=/home/test/testupload2
size=170000
date=1603278566981
repo-name=testupload3
repo-path=/home/test/testupload3
and the awk script below processes it:
#!/bin/bash
awk -vOFS='\t' '
BEGIN{ FS="=" }
/^size/{
if(++count1==1){ header=$1"," }
sizeArr[++count]=$NF
next
}
/^@repo-name/{
if(++count2==1){ header=header OFS $1"," }
repoNameArr[count]=$NF
next
}
/^date/{
if(++count3==1){ header=header OFS $1"," }
dateArr[count]=$NF
next
}
/^@blob-name/{
if(++count4==1){ header=header OFS $1"," }
repopathArr[count]=$NF
next
}
END{
print header
for(i=1;i<=count;i++){
printf("%s,%s,%s,%s\n",sizeArr[i],repoNameArr[i],dateArr[i],repopathArr[i])
}
}
' details.txt | tr -d @ |awk -F, '{$3=substr($3,0,10)}1' OFS=,|sed 's/date/creationTime/g'
which prints the values as expected (because every record has a repo-name):
size " repo-name" " creationTime" " blob-name"
10496000 testupload Fri 11 Dec 2020 07:35:56 AM CET testfile.tar11.gz
10496000 testupload Thu 10 Dec 2020 02:44:04 PM CET testfile.tar.gz
9602303 testupload Fri 11 Dec 2020 07:38:58 AM CET apache-maven-3.6.3-bin/apache-maven-3.6.3-bin.zip
But when a field is missing from the file, the output format is wrong (here the repo-name header jumps to the last column, because the first few records have no repo-name value):
size " creationTimeime" " blob-name" " " repo-name"
261304 Thu 13 Feb 2020 08:50:02 AM CET temp 8963d25231b
29639 Thu 13 Feb 2020 08:50:00 AM CET temp 3780c72cab5
93699 Thu 13 Feb 2020 08:50:00 AM CET temp 209276c91ba
The column headers are printed in the wrong order, although the data itself is printed correctly. Is there any way to check whether a field is missing, skip it, and still print the rest in the proper format?
If a value is not available, the header should stay the same; the header sequence should not change.
My requirement: if details.txt is missing any field, the script should print a blank for it and keep the output aligned with the headers. The headers get disturbed when the repo-name field is absent, while the rest of the output is correct, so the headers need to stay intact even if a field is missing.
Wrong
size " creationTimeime" " blob-name" " " repo-name"
261304 Thu 13 Feb 2020 08:50:02 AM CET temp 8963d25231b
29639 Thu 13 Feb 2020 08:50:00 AM CET temp 3780c72cab5
93699 Thu 13 Feb 2020 08:50:00 AM CET temp 209276c91ba
Right
size " repo-name" " creationTime" " blob-name"
10496000 testupload Fri 11 Dec 2020 07:35:56 AM CET testfile.tar11.gz
10496000 testupload Thu 10 Dec 2020 02:44:04 PM CET testfile.tar.gz
9602303 testupload Fri 11 Dec 2020 07:38:58 AM CET apache-maven-3.6.3-bin/apache-maven-3.6.3-bin.zip
Thanks
samurai
Upvotes: 1
Views: 113
Reputation: 785156
You may try this GNU awk solution (it uses arrays of arrays, a gawk extension):
awk -F= -v OFS='\t' 'function prt(ind, name, s) {s=map[ind][name]; return (s==""?" ":s);} {map[NR][$1] = $2} END {print "Size", "Repo Name", "CreationTime", "Repo Path"; for (i=1; i<=NR; i+=4) print prt(i, "size"), prt(i+2, "repo-name"), prt(i+1, "date"), prt(i+3, "repo-path")}' file
Size Repo Name CreationTime Repo Path
190000 testupload 1603278566981 /home/test/testupload
140000 testupload2 1603278566981 /home/test/testupload2
170000 testupload3 1603278566981 /home/test/testupload3
A more readable version:
awk -F= -v OFS='\t' '
function prt(ind, name,    s) {
    s = map[ind][name]
    return (s == "" ? " " : s)
}
{
    map[NR][$1] = $2
}
END {
    print "Size", "Repo Name", "CreationTime", "Repo Path"
    for (i = 1; i <= NR; i += 4)
        print prt(i, "size"), prt(i+2, "repo-name"), prt(i+1, "date"), prt(i+3, "repo-path")
}' file
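The script above reads the file in fixed blocks of four lines, so a record that lacks one line shifts every later lookup. If records can be incomplete, one possible variation is to key the values by record number and field name instead of by line number. The sketch below is only an illustration under two assumptions that the question does not confirm: every record starts with a `size` line, and the header order is fixed by hand. The temporary sample file and the names `val`, `n`, `input` and `out` are hypothetical. It uses only POSIX awk features, and it truncates the millisecond timestamp to 10 digits with `substr(..., 1, 10)`.

```shell
#!/bin/sh
# Hypothetical sample: the second record has no repo-name line.
input=$(mktemp)
cat > "$input" <<'EOF'
size=190000
date=1603278566981
repo-name=testupload
repo-path=/home/test/testupload
size=140000
date=1603278566981
repo-path=/home/test/testupload2
EOF

out=$(awk -F= -v OFS='\t' '
# Assumption: each "size" line opens a new record.
$1 == "size" { n++ }

# Store the value under (record number, field name).
# Everything after the first "=" is kept, so values may contain "=".
n { val[n, $1] = substr($0, index($0, "=") + 1) }

END {
    # The header order is fixed here, so it cannot depend on the input.
    print "size", "repo-name", "creationTime", "repo-path"
    for (i = 1; i <= n; i++)
        # A field that never appeared for record i prints as an empty column.
        print val[i, "size"], val[i, "repo-name"], substr(val[i, "date"], 1, 10), val[i, "repo-path"]
}' "$input")

printf '%s\n' "$out"
rm -f "$input"
```

Because the header is printed from a fixed list rather than built up as fields are encountered, a missing `repo-name` leaves an empty column in that row and the header sequence stays the same.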
Upvotes: 1