Reputation: 77
I have a file in the format below. Can anyone convert it in columns? I have tried the awk command below but it creates more that 4 columns if one customer has multiple hostnames.
awk '/"customer_name":/{if (x)print x;x="";}{x=(!x)?$0:x","$0;}END{print x;}' filename
Input:
customer_name: "abc"
"HostName": "tm-1"
"LastDayRxBytes": 0
"Status": "offline"
"HostName": "tm-2"
"LastDayRxBytes": 0
"Status": "offline"
"HostName": "tm-3"
"LastDayRxBytes": 0
"Status": "offline"
"HostName": "new-va-threat-01"
"LastDayRxBytes": 0
"Status": "offline"
customer_name: "xyz"
"HostName": "tm-56"
"LastDayRxBytes": 10708747
"Status": "ok"
customer_name: "def"
customer_name: "uvw"
"HostName": "tm-23"
"LastDayRxBytes": 34921829912
"Status": "ok"
customer_name: "new cust"
"HostName": "tm-1-3"
"LastDayRxBytes": 33993187093
"Status": "ok"
customer_name: "a12 d32 ffg"
customer_name: "bcd abc"
customer_name: "mno opq"
customer_name: "abc dhg pvt ltd."
"HostName": "tm-10"
"LastDayRxBytes": 145774401010
"Status": "ok"
"HostName": "tm-ngtm-13"
"LastDayRxBytes": 150159680874
"Status": "ok"
"HostName": "new-ngtm-11"
"LastDayRxBytes": 207392526747
"Status": "ok"
"HostName": "old-ngtm-06"
"LastDayRxBytes": 17708734533
"Status": "ok"
"HostName": "tm-08"
"LastDayRxBytes": 559289251
"Status": "ok"
"HostName": "tm-12"
"LastDayRxBytes": 534145552271
"Status": "ok"
I want it to be printed in column and rows as:
Column 1 Column 2 Column 3 Column 4
CustName Host Last RX Status
abc tm-1 0 offline
abc tm-2 0 offline
abc tm-3 0 offline
abc new-va-threat-01 0 offline
xyz tm-56 10708747 ok
def
uvw tm-23 34921829912 ok
new_cust tm-1-3 33993187093 ok
a12 d32 ffg
acd abc
mno opq
abc dhg pvt ltd. tm-10 145774401010 ok
abc dhg pvt ltd. tm-ngtm-13 150159680874 ok
abc dhg pvt ltd. new-ngtm-11 207392526747 ok
abc dhg pvt ltd. old-ngtm-06 17708734533 ok
abc dhg pvt ltd. tm-08 559289251 ok
abc dhg pvt ltd. tm-12 534145552271 ok
Upvotes: 0
Views: 81
Reputation: 2882
gnu awk solution:
$ cat tst.awk
BEGIN {
RS="customer_name: "
pr("Column1", "Column2", "Column3", "Column4")
pr("Custname", "Host", "Last RX", "Status")
}
match($0, /"([^"]+)"/, cust) {
printed=0
str=substr($0, RLENGTH+2)
while (match( str, /"HostName":\s"([^"]+)"\s+"LastDayRxBytes":\s(\S+)\s+"Status":\s"([^"]+)"\s/, col)){
str=substr(str, RLENGTH+3)
pr( cust[1], col[1], col[2], col[3] )
printed=1
}
if (!printed) pr(cust[1])
}
function pr(cust,host,rx,status) {
printf "%-16s\t%-16s\t%-16s\t%-10s\n", cust, host, rx, status
}
Based on the example input, one can tackle this one using regexes and the match function as well. Testing it:
$ awk -f tst.awk input.txt
Column1 Column2 Column3 Column4
Custname Host Last RX Status
abc tm-1 0 offline
abc tm-2 0 offline
abc tm-3 0 offline
abc new-va-threat-01 0 offline
xyz tm-56 10708747 ok
def
uvw tm-23 34921829912 ok
new cust tm-1-3 33993187093 ok
a12 d32 ffg
bcd abc
mno opq
abc dhg pvt ltd. tm-10 145774401010 ok
abc dhg pvt ltd. tm-ngtm-13 150159680874 ok
abc dhg pvt ltd. new-ngtm-11 207392526747 ok
abc dhg pvt ltd. old-ngtm-06 17708734533 ok
abc dhg pvt ltd. tm-08 559289251 ok
abc dhg pvt ltd. tm-12 534145552271 ok
Explanation:
customer_name:
, so $0 contains all host, rx and status information per customer."([^"]+)"
will capture the customer"HostName":\s"([^"]+)"\s+"LastDayRxBytes":\s(\S+)\s+"Status":\s"([^"]+)"\s
will capture hostname, rx and status.I know, this is not the awk way
of doing things, but then again the regular format of the input allows this - quite concise - regex-based solution.
Upvotes: 0
Reputation: 246744
I'd write this
awk -F": " -v OFS="\t" '
BEGIN {print "CustName", "Host", "Last RX", "Status"}
{
gsub(/"/,"")
sub(/^[[:blank:]]+/,"")
}
$1 == "customer_name" {
if ("customer_name" in data && !have_data)
print data["customer_name"]
have_data = 0
}
{
data[$1] = $2
}
("HostName" in data) && ("LastDayRxBytes" in data) && ("Status" in data) {
print data["customer_name"], data["HostName"], data["LastDayRxBytes"], data["Status"]
delete data["HostName"]
delete data["LastDayRxBytes"]
delete data["Status"]
have_data = 1
}
' file | column -s $'\t' -t
CustName Host Last RX Status
abc tm-1 0 offline
abc tm-2 0 offline
abc tm-3 0 offline
abc new-va-threat-01 0 offline
xyz tm-56 10708747 ok
def
uvw tm-23 34921829912 ok
new cust tm-1-3 33993187093 ok
a12 d32 ffg
bcd abc
mno opq
abc dhg pvt ltd. tm-10 145774401010 ok
abc dhg pvt ltd. tm-ngtm-13 150159680874 ok
abc dhg pvt ltd. new-ngtm-11 207392526747 ok
abc dhg pvt ltd. old-ngtm-06 17708734533 ok
abc dhg pvt ltd. tm-08 559289251 ok
abc dhg pvt ltd. tm-12 534145552271 ok
Upvotes: 1
Reputation: 241758
Perl to the rescue:
perl -lne '
if (/customer_name: "(.*)"/) {
print $h{name} unless $h{printed} || !%h;
undef $h{printed} if $1 ne $h{name};
$h{name} = $1;
} else {
/"([^"]+)": "?([^"]+)"?/ and $h{$1} = $2;
$h{printed} = print join "\t",
@h{qw{ name HostName LastDayRxBytes Status }}
if "Status" eq $1;
}
END { print $h{name} unless $h{printed} || !%h }
' < input_file
%h
hash is used to gather information about lines to be printed.Upvotes: 0