Samurai

Reputation: 139

How to print a blank when a value is missing and keep rows and columns aligned

I have a details.txt file that contains the following data:

size=190000
date=1603278566981
repo-name=testupload
repo-path=/home/test/testupload
size=140000
date=1603278566981
repo-name=testupload2
repo-path=/home/test/testupload2
size=170000
date=1603278566981
repo-name=testupload3
repo-path=/home/test/testupload3

and the awk script below processes it:

#!/bin/bash
awk -vOFS='\t' '
BEGIN{ FS="=" }
/^size/{
  if(++count1==1){ header=$1"," }
  sizeArr[++count]=$NF
  next
}
/^@repo-name/{
  if(++count2==1){ header=header OFS $1"," }
  repoNameArr[count]=$NF
  next
}
/^date/{
  if(++count3==1){ header=header OFS $1"," }
  dateArr[count]=$NF
  next
}
/^@blob-name/{
  if(++count4==1){ header=header OFS $1"," }
  repopathArr[count]=$NF
  next
}
END{
  print header
  for(i=1;i<=count;i++){
    printf("%s,%s,%s,%s,%s\n",sizeArr[i],repoNameArr[i],dateArr[i],repopathArr[i])
  }
}
' details.txt | tr -d @ |awk -F, '{$3=substr($3,0,10)}1' OFS=,|sed 's/date/creationTime/g'

This prints the values as expected (because the repo name is present):

size    "   repo-name"  "   creationTime"   "   blob-name"
10496000    testupload  Fri 11 Dec 2020 07:35:56 AM CET testfile.tar11.gz
10496000    testupload  Thu 10 Dec 2020 02:44:04 PM CET testfile.tar.gz
9602303     testupload  Fri 11 Dec 2020 07:38:58 AM CET apache-maven-3.6.3-bin/apache-maven-3.6.3-bin.zip

But when a field is missing from the file, the output format goes wrong (here the repo-name header jumps to the last column, because the first few records have no repo-name value):

size    "   creationTimeime"    "   blob-name"  "       "   repo-name"
261304      Thu 13 Feb 2020 08:50:02 AM CET temp    8963d25231b
29639       Thu 13 Feb 2020 08:50:00 AM CET temp    3780c72cab5
93699       Thu 13 Feb 2020 08:50:00 AM CET temp    209276c91ba

The column headers are printed in the wrong order, although the data itself is printed correctly. Is there a way to check whether a field is missing, skip it, and still print the rest in the proper format?

If data is not available, the headers should stay the same; the header sequence should not change.

My requirement: if details.txt is missing any field, the script should print a blank for it and still follow the header order. The headers get disturbed when the repo-name field is absent, but the rest of the output is correct, so the headers need to stay intact even if a field is missing.

Wrong

size    "   creationTimeime"    "   blob-name"  "       "   repo-name"
261304      Thu 13 Feb 2020 08:50:02 AM CET temp    8963d25231b
29639       Thu 13 Feb 2020 08:50:00 AM CET temp    3780c72cab5
93699       Thu 13 Feb 2020 08:50:00 AM CET temp    209276c91ba

Right

size    "   repo-name"  "   creationTime"   "   blob-name"
    10496000    testupload  Fri 11 Dec 2020 07:35:56 AM CET testfile.tar11.gz
    10496000    testupload  Thu 10 Dec 2020 02:44:04 PM CET testfile.tar.gz
    9602303     testupload  Fri 11 Dec 2020 07:38:58 AM CET apache-maven-3.6.3-bin/apache-maven-3.6.3-bin.zip

Thanks

samurai

Upvotes: 1

Views: 113

Answers (1)

anubhava

Reputation: 785156

You may try this GNU awk solution:

awk -F= -v OFS='\t' 'function prt(ind, name, s) {s=map[ind][name]; return (s==""?" ":s);} {map[NR][$1] = $2} END {print "Size", "Repo Name", "CreationTime", "Repo Path"; for (i=1; i<=NR; i+=4) print prt(i, "size"), prt(i+2, "repo-name"), prt(i+1, "date"), prt(i+3, "repo-path")}' file

Size    Repo Name    CreationTime   Repo Path
190000  testupload   1603278566981  /home/test/testupload
140000  testupload2  1603278566981  /home/test/testupload2
170000  testupload3  1603278566981  /home/test/testupload3

To make it readable:

awk -F= -v OFS='\t' 'function prt(ind, name, s) {
   s = map[ind][name]
   return (s==""?" ":s)
}
{
   map[NR][$1] = $2
}
END {
   print "Size", "Repo Name", "CreationTime", "Repo Path"
   for (i=1; i<=NR; i+=4)
      print prt(i, "size"), prt(i+2, "repo-name"), prt(i+1, "date"), prt(i+3, "repo-path")
}' file 
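
The command above walks the input in strides of four lines, so it relies on every record having all four lines in the same order. If a line such as repo-name can be absent, a variant keyed on the field name rather than the line position may be more robust. Here is a minimal sketch under one assumption that is not stated in the question: every record starts with a size= line. The names format_details and emit are made up for this illustration, and the header is printed once in BEGIN, so its order never depends on which field shows up first.

```shell
#!/bin/sh
# Sketch only. Assumption: each record begins with a "size=" line.
# format_details and emit are hypothetical names used for illustration.
format_details() {
  awk -F= -v OFS='\t' '
    # Print the record collected so far; a missing key prints as an empty field.
    function emit() {
      if (seen)
        print rec["size"], rec["repo-name"], rec["date"], rec["repo-path"]
      split("", rec)   # portable way to clear the array
      seen = 0
    }
    # Fixed header, independent of the order in which keys appear.
    BEGIN { print "Size", "Repo Name", "CreationTime", "Repo Path" }
    # A size line closes the previous record and starts a new one.
    $1 == "size" { emit() }
    # Store the value under its key; keep any "=" inside the value.
    NF >= 2 { rec[$1] = substr($0, length($1) + 2); seen = 1 }
    END { emit() }
  ' "$1"
}
```

Call it as format_details details.txt. It uses only plain awk features, so it should not need gawk's arrays of arrays.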

Upvotes: 1
