skeenster

Reputation: 11

Multiple awk output to csv columns

I have a number of awk commands that each run well on their own. The five commands below currently output the required data, separated by commas. I would like to combine their output so that it ends up in a single CSV file, but I am not having much luck.

Each command searches for lines containing a particular word and extracts the required fields. For example, I would like $4,$5,$6,$7,$8,$9 from the first command to fill the first 6 columns, followed by $2,$4 from the second command in the next 2 columns.

awk '/start/ { print $4,$5,$6,$7,$8,$9 }' 07.08.2014.txt | sed -e "s/ /,/g"
awk '/PacketLoss/ { print $2,$4 }' 07.08.2014.txt | sed -e "s/ /,/g"
awk '/PacketOutOfSequence/ { print $2,$4,$6 }' 07.08.2014.txt | sed -e "s/ /,/g"
awk '/JitterSD/ { print $3,$6,$9 }' 07.08.2014.txt | sed -e "s/ /,/g"
awk '/NumOfRTT/ { print $2,$4,$6,$8 }' 07.08.2014.txt | sed -e "s/ /,/g" 

Example input data

Aggregation start time 11:45:47.893 BST Thu Aug 7 2014
NumOfRTT: 360000        RTTAvg: 145     RTTMin: 144     RTTMax: 171
PacketLossSD: 0 PacketLossDS: 0
PacketOutOfSequence: 3  PacketMIA: 0    PacketLateArrival: 0
Jitter Avg: 1   JitterSD Avg: 1 JitterDS Avg: 1

Example output

11:45:47.893,BST,Thu,Aug,7,2014,0,0,3,0,0,1,1,1,360000,145,144,171

It would also be nice to label each column as below, but that isn't critical if it is too complicated, as I can do it manually.

START_TIME,BST,DAY,MONTH,DATE,YEAR,PacketLossSD,PacketLossDS,PacketOutOfSeq,PacketMIA,PacketLateArrival,JitterAvg,JitterSD_Avg,JitterDS_Avg,NumOfRTT,RTTAvg,RTTMin,RTTMax
11:45:47.893,BST,Thu,Aug,7,2014,0,0,3,0,0,1,1,1,360000,145,144,171

Thanks in advance for any assistance :)

Upvotes: 1

Views: 1062

Answers (1)

ooga

Reputation: 15501

awk '
  NR==1 { print "START_TIME,BST,DAY,MONTH,DATE,YEAR,PacketLossSD,PacketLossDS,PacketOutOfSeq,PacketMIA,PacketLateArrival,JitterAvg,JitterSD_Avg,JitterDS_Avg,NumOfRTT,RTTAvg,RTTMin,RTTMax" }
  /start/               { tm = sprintf("%s,%s,%s,%s,%s,%s",$4,$5,$6,$7,$8,$9) }
  /PacketLoss/          { pl  = sprintf("%s,%s",$2,$4) }
  /PacketOutOfSequence/ { pos = sprintf("%s,%s,%s",$2,$4,$6) }
  /NumOfRTT/            { rtt = sprintf("%s,%s,%s,%s",$2,$4,$6,$8) }
  /JitterSD/            { printf("%s,%s,%s,%s,%s,%s,%s\n",tm,pl,pos,$3,$6,$9,rtt) }
' 07.08.2014.txt

The idea is to save the data in strings as the lines are read and only print it out when the last line (assumed to be the one containing "JitterSD") is read.
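If the "JitterSD" line cannot be relied on to come last, a variation on the same idea is to print the saved strings when the next "start" line arrives and once more at the end of the file. This is only a sketch: it assumes every record begins with the "start" line, and the file name sample.txt and the function name emit are made up for the illustration.

```shell
# Sample record copied from the question; sample.txt is a hypothetical file name.
cat > sample.txt <<'EOF'
Aggregation start time 11:45:47.893 BST Thu Aug 7 2014
NumOfRTT: 360000        RTTAvg: 145     RTTMin: 144     RTTMax: 171
PacketLossSD: 0 PacketLossDS: 0
PacketOutOfSequence: 3  PacketMIA: 0    PacketLateArrival: 0
Jitter Avg: 1   JitterSD Avg: 1 JitterDS Avg: 1
EOF

out=$(awk '
  # Print the buffered record, if any, then reset the saved strings
  # so a missing line cannot inherit values from the previous record.
  function emit() {
    if (have) print tm, pl, pos, jit, rtt
    have = 0
    tm = pl = pos = jit = rtt = ""
  }
  BEGIN { OFS = "," }
  /start/               { emit(); have = 1; tm = $4 OFS $5 OFS $6 OFS $7 OFS $8 OFS $9 }
  /PacketLoss/          { pl  = $2 OFS $4 }
  /PacketOutOfSequence/ { pos = $2 OFS $4 OFS $6 }
  /JitterSD/            { jit = $3 OFS $6 OFS $9 }
  /NumOfRTT/            { rtt = $2 OFS $4 OFS $6 OFS $8 }
  END                   { emit() }
' sample.txt)

echo "$out"
# prints: 11:45:47.893,BST,Thu,Aug,7,2014,0,0,3,0,0,1,1,1,360000,145,144,171
```

Because the row is assembled only in emit, the order of the four data lines within a record no longer matters.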

Alternate idea:

awk '
  BEGIN { RS=""; FS="\n"; OFS="," }
  {
    split($1,a," "); L1=a[4]","a[5]","a[6]","a[7]","a[8]","a[9]
    split($2,a," "); L2=a[2]","a[4]","a[6]","a[8]
    split($3,a," "); L3=a[2]","a[4]
    split($4,a," "); L4=a[2]","a[4]","a[6]
    split($5,a," "); L5=a[3]","a[6]","a[9]
    print L1, L3, L4, L5, L2
  }
' 07.08.2014.txt

RS="" is a special setting that splits records on one or more blank lines, used for multiline records.

FS="\n" will set $1, $2, etc, to line 1, line 2, etc of the multiline record.

split($1,a," ") splits the line on runs of whitespace (a single space as the separator means the default field splitting), putting the pieces in the array a.
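To see how these settings interact, here is a small sketch on made-up input: two records of two lines each, separated by a blank line. The input text and the wording of the printed message are assumptions for the demonstration, not part of the original data.

```shell
# Two multiline records separated by one blank line (made-up input).
out=$(printf 'a b c\nd e\n\nf g h\ni j\n' | awk '
  BEGIN { RS = ""; FS = "\n" }   # paragraph mode: one record per block, one field per line
  {
    n = split($2, a, " ")        # split the second line of the record into words
    print NR ": " NF " lines, second line has " n " words, first is " a[1]
  }')

echo "$out"
# prints:
# 1: 2 lines, second line has 2 words, first is d
# 2: 2 lines, second line has 2 words, first is i
```

NF counts lines here, not words, which is why split is needed to get at the individual values.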

Upvotes: 4
