Justin Davis

Reputation: 19

Append row conditionally as column bash script

I have been trying to write a bash script that reformats the output of a command. The command prints what should be columns as a single list of records, one field per line:

host="host1"
Disk Agent="A.06.20"
General Media Agent="A.06.20"
host="host2"
Disk Agent="A.06.20"
General Media Agent="A.06.20"
host="host3"
Disk Agent="A.06.20"
host="host4"
Disk Agent="A.06.20"
General Media Agent="A.06.20"

I would like to have the script format it as such:

host="host1",Disk Agent="A.06.20",General Media Agent="A.06.20"
host="host2",Disk Agent="A.06.20",General Media Agent="A.06.20"
host="host3",Disk Agent="A.06.20",
host="host4",Disk Agent="A.06.20",General Media Agent="A.06.20"

As you can see, not every host has all 3 values, so it can't just iterate the list.

There are hundreds of hosts in my output and it's very frustrating that the command doesn't have options for creating a table or report.

The output also contains a lot of other garbage that I've been able to strip out with sed, but I'm very new to sed and awk, so it's giving me a headache.

Thanks!

Upvotes: 1

Views: 119

Answers (3)

Ed Morton

Reputation: 204456

Good grief. The general purpose text processing tool for UNIX is awk, just use it:

awk '
/^host/ { if (rec) print rec; rec=sep=""} 
{ rec = rec sep $0; sep="," }
END { print rec }
' file
host="host1",Disk Agent="A.06.20",General Media Agent="A.06.20"
host="host2",Disk Agent="A.06.20",General Media Agent="A.06.20"
host="host3",Disk Agent="A.06.20"
host="host4",Disk Agent="A.06.20",General Media Agent="A.06.20"

Or, more generally useful, the version below always prints the same number of comma-separated fields on each output line and handles ANY input line being missing:

$ cat file
host="host1"
Disk Agent="A.06.20"
General Media Agent="A.06.20"
host="host2"
General Media Agent="A.06.20"
host="host3"
Disk Agent="A.06.20"
host="host4"
Disk Agent="A.06.20"
General Media Agent="A.06.20"

awk '
BEGIN { FS="="; OFS="," }
/^host/ { ++numRecs }
!($1 in fld2nr) { fld2nr[$1] = ++numFlds }
{ recs[numRecs,fld2nr[$1]] = $0 }
END {
    for (recNr=1; recNr<=numRecs; recNr++) {
        for (fldNr=1; fldNr<=numFlds; fldNr++) {
            printf "%s%s", recs[recNr,fldNr], (fldNr<numFlds?OFS:ORS)
        }
    }
}
' file
host="host1",Disk Agent="A.06.20",General Media Agent="A.06.20"
host="host2",,General Media Agent="A.06.20"
host="host3",Disk Agent="A.06.20",
host="host4",Disk Agent="A.06.20",General Media Agent="A.06.20"

Upvotes: 1

bkmoney

Reputation: 1256

Here is a sed script:

sed '/host/{:loop; N; /\nhost/!s/\n/,/; t loop; P; D}' foo.txt

It works by matching a host line, then appending the next line to the pattern space with N. If the appended line does not start with host, the \n between the two lines is replaced with a comma and the t command branches back to loop. The loop ends when the appended line is the next "host" line. The P command then prints the part of the multiline pattern space before the \n, and D deletes that part and transfers control to the top of the script, so the next "host" line becomes the current line and the script starts again.

Which outputs:

host="host1",Disk Agent="A.06.20",General Media Agent="A.06.20"
host="host2",Disk Agent="A.06.20",General Media Agent="A.06.20"
host="host3",Disk Agent="A.06.20"
host="host4",Disk Agent="A.06.20",General Media Agent="A.06.20"
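
For readers new to sed, the same one-liner can be written one command per line with comments. This is only a sketch: the wrapper name join_hosts_sed is made up for illustration, and it assumes GNU sed. The last record is printed only because GNU sed prints the pattern space when N runs out of input; POSIX says N quits without printing in that case, so other sed implementations may drop the final host.

```shell
# join_hosts_sed is an illustrative wrapper name; it reads records from stdin.
# Assumes GNU sed, which prints the pattern space when N hits end of input.
join_hosts_sed() {
    sed '
    # Only start work on a line containing "host".
    /host/ {
        :loop
        # Append the next input line to the pattern space after a newline.
        N
        # Unless the appended line starts with "host", turn the newline into a comma.
        /\nhost/!s/\n/,/
        # If a substitution happened, go back and append another line.
        t loop
        # Print up to the first newline: one finished record.
        P
        # Delete what was printed and restart with the next "host" line.
        D
    }'
}
```

It could be used as `join_hosts_sed < foo.txt`.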

Upvotes: 2

Nidhoegger

Reputation: 5232

I put together a small script that does what you want; it takes very little effort to learn the parts you need.

The only thing you really need awk for is splitting the current line into its name and value, which can be done with:

name=$(awk -F'=' '{print $1}' <<< "$line")
value=$(awk -F'=' '{print $2}' <<< "$line")

The -F option sets the field separator, i.e. the character at which the line is split. print $1 and print $2 then print the first and second fields, here the name and the value.
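
To see what the two awk calls return, here is a minimal sketch run on one sample line. The helper name split_line is hypothetical and exists only for this illustration; it feeds awk through a pipe instead of a here-string, which behaves the same for a single line.

```shell
# split_line is a hypothetical helper name used only for this illustration.
split_line() {
    line=$1
    # -F'=' splits at "=": $1 is the text before it, $2 the text after it.
    name=$(printf '%s\n' "$line" | awk -F'=' '{print $1}')
    value=$(printf '%s\n' "$line" | awk -F'=' '{print $2}')
    printf 'name=[%s] value=[%s]\n' "$name" "$value"
}

split_line 'Disk Agent="A.06.20"'
# prints: name=[Disk Agent] value=["A.06.20"]
```

Note that the value keeps its surrounding double quotes, so it can be printed back unchanged. If a value ever contained a second "=", $2 would stop there.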

All that is left is comparing strings and writing the output. Only values that are actually present get printed, so store them like this:

        if [ "${name}" == "host" ]; then
                output_data
                host="${value}"
        elif [ "${name}" == "Disk Agent" ]; then
                disk_agent="${value}"
        elif [ "${name}" == "General Media Agent" ]; then
                general_agent="${value}"
        fi

and output them like this:

        if [ -n "${host}" ]; then
                echo -n "host=${host}"
                if [ -n "${disk_agent}" ]; then
                        echo -n ",Disk Agent=${disk_agent}"
                fi
                if [ -n "${general_agent}" ]; then
                        echo -n ",General Media Agent=${general_agent}"
                fi
                echo
        fi

After printing the values, do not forget to reset the variables to an empty string (""); otherwise the old value may be printed again in the next iteration.
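
Putting the fragments above together, a complete version might look roughly like the sketch below. The function name join_records is an assumption of this sketch, and it swaps the two awk calls for shell parameter expansion (${line%%=*} and ${line#*=}), which avoids starting two awk processes per input line. The reset happens at the end of output_data, and one extra call after the loop flushes the last host.

```shell
#!/usr/bin/env bash
# Sketch only: join_records is an illustrative name, and parameter expansion
# replaces the awk calls from the fragments above.

output_data() {
    if [ -n "${host}" ]; then
        printf 'host=%s' "${host}"
        [ -n "${disk_agent}" ] && printf ',Disk Agent=%s' "${disk_agent}"
        [ -n "${general_agent}" ] && printf ',General Media Agent=%s' "${general_agent}"
        printf '\n'
    fi
    # Reset so values do not leak into the next host's record.
    host="" disk_agent="" general_agent=""
}

join_records() {
    local line name value
    host="" disk_agent="" general_agent=""
    # The "|| [ -n ... ]" part keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "${line}" ]; do
        name=${line%%=*}    # text before the first "="
        value=${line#*=}    # text after the first "="
        case "${name}" in
            host)                  output_data; host="${value}" ;;
            "Disk Agent")          disk_agent="${value}" ;;
            "General Media Agent") general_agent="${value}" ;;
        esac
    done
    output_data    # flush the last record
}
```

It reads from standard input, so the command's (already cleaned) output can be piped straight into join_records. Lines with any other name are silently ignored.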

Upvotes: 0
