Reputation: 837
I'm running a command-line tool that returns results like this:
data {
metric: 0
metric: 1234.5
metric: 230499
metric: 234234
}
data {
metric: 0
metric: 6789
metric: 23526
metric: 234634767
}
I'd like to calculate 1234.5/6789, that is, the ratio between the second metric line of the first result and the second metric line of the second result. The numbers can be decimals, and the results will always appear in that order. Is this possible with grep/sed?
Upvotes: 0
Views: 1434
Reputation: 104024
Here is a Perl solution.
Given:
$ echo "$tgt"
data {
metric: 0
metric: 1234.5
metric: 230499
metric: 234234
}
data {
metric: 0
metric: 6789
metric: 23526
metric: 234634767
}
You can use a regex in perl's 'slurp' mode to find the pairs you wish:
$ echo "$tgt" | perl -0777 -lne '
@a=/^data\s+\{\s+(?:metric:[\s\d.]+){1}metric:\s+(\d+(?:\.\d+)?)/gm;
print $a[0]/$a[1]
'
0.181838267786125
The value inside the braces in (?:metric:[\s\d.]+){1} (1 in this case) selects which pair is used: 1234.5 and 6789 in this case.
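To make the role of that quantifier concrete, here is a minimal Python sketch of the same idea. It is not part of the Perl answer: the function name metric_ratio, its skip parameter, and the two-space indentation of the sample text are assumptions made for illustration.

```python
import re

# Sample input; the two-space indentation is an assumption about the real output.
SAMPLE = """\
data {
  metric: 0
  metric: 1234.5
  metric: 230499
  metric: 234234
}
data {
  metric: 0
  metric: 6789
  metric: 23526
  metric: 234634767
}
"""


def metric_ratio(text, skip=1):
    """Divide the metric found after `skip` earlier metrics in block 1
    by the metric at the same position in block 2."""
    # `skip` plays the role of the number inside the braces in the Perl regex.
    pattern = re.compile(
        r"^data\s+\{\s+(?:metric:[\s\d.]+){%d}metric:\s+(\d+(?:\.\d+)?)" % skip,
        re.MULTILINE,
    )
    values = [float(v) for v in pattern.findall(text)]
    return values[0] / values[1]


ratio = metric_ratio(SAMPLE)  # second metric of each block
print(ratio)
```

Calling metric_ratio(SAMPLE, 2) would pick the third metric of each block instead. A skip of 0 selects the first metrics, which are both 0 in this sample, so the division would fail.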
Upvotes: 0
Reputation: 36036
grep/sed cannot perform arithmetic evaluation, nor do they have the ability to set state variables, so no, this isn't possible with them alone. They aren't designed for anything beyond search and replace. It can be achieved with stunts that couple them with head/bc/etc., but that is highly inconvenient and fragile.
This is possible with awk. The code below needs GNU awk, because the three-argument form of match() is a gawk extension. It is tailored to be production-grade, so it validates the input and adheres to the DRY principle:
function error(m){print m " at line " FNR ":`" $0 "'">"/dev/stderr";_error=1;exit 1;}
BEGIN{brace=0; #brace level
index_=0; #record index
v1="+NaN";v2=v1; #values; if either is not reassigned, the result will be NaN
first_section=0; #1st section ended
second_section=0; #2nd section ended
record_pattern="[[:space:]]*metric:[[:space:]]*([[:digit:]]+(\\.[[:digit:]]+)?)[[:space:]]*$";
}
END{if(_error)exit;
if (brace>0){error("invalid:unclosed section");}
if(!second_section){error("invalid:less than 2 sections present")}}
#section start
/^data[[:space:]]+\{[[:space:]]*$/{if(brace>0){error("invalid:nested brace");}brace+=1;next;}
#section end
/^\}[[:space:]]*$/{brace-=1;if(brace<0){error("invalid:unmatched brace")}index_=0;
if(!first_section){first_section=1;next;}
if(!second_section){second_section=1;}
next;}
#record
$0~record_pattern{
match($0,record_pattern,m); #awk cannot capture groups from the line pattern
if(brace==0)error("invalid:record outside a section");
if(index_==1){
if(!first_section){v1=m[1];}
else if(!second_section){v2=m[1];}}
index_++;next;
}
#anything else
{error("invalid:unrecognized syntax");}
#in the very end and if there were no errors
END{print v1/v2;}
Equivalent programs in Perl and Python, though, would be much more readable (and thus maintainable).
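As a rough illustration of that readability claim, here is one way the awk program above might look in Python. This is a sketch, not a line-for-line port: the function and helper names are hypothetical, errors are raised as ValueError instead of being printed to stderr, and a section with fewer than two metrics is rejected rather than yielding NaN.

```python
import re

SECTION_START = re.compile(r"data\s+\{\s*$")
SECTION_END = re.compile(r"\}\s*$")
RECORD = re.compile(r"\s*metric:\s*(\d+(?:\.\d+)?)\s*$")


def _error(message, lineno, line):
    # Hypothetical helper: builds an error similar to the awk error() output.
    return ValueError(f"invalid: {message} at line {lineno}: {line!r}")


def second_metric_ratio(lines):
    """Validate the input and return (2nd metric of section 1) / (2nd metric of section 2)."""
    sections = []   # one list of floats per closed section
    current = None  # metrics of the open section, or None when outside a section
    for lineno, line in enumerate(lines, start=1):
        record = RECORD.match(line)
        if SECTION_START.match(line):
            if current is not None:
                raise _error("nested brace", lineno, line)
            current = []
        elif SECTION_END.match(line):
            if current is None:
                raise _error("unmatched brace", lineno, line)
            sections.append(current)
            current = None
        elif record:
            if current is None:
                raise _error("record outside a section", lineno, line)
            current.append(float(record.group(1)))
        else:
            raise _error("unrecognized syntax", lineno, line)
    if current is not None:
        raise ValueError("invalid: unclosed section")
    if len(sections) < 2:
        raise ValueError("invalid: fewer than 2 sections present")
    if len(sections[0]) < 2 or len(sections[1]) < 2:
        raise ValueError("invalid: a section has fewer than 2 metrics")
    return sections[0][1] / sections[1][1]


SAMPLE = """\
data {
  metric: 0
  metric: 1234.5
}
data {
  metric: 0
  metric: 6789
}
"""
result = second_metric_ratio(SAMPLE.splitlines())
print(result)
```

Collecting each section into a list keeps the state down to two variables, at the cost of holding all metrics in memory, which should be negligible for output of this size.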
Upvotes: 0
Reputation: 8140
Here's a solution using awk:
#!/usr/bin/awk -f
BEGIN {
FS=" *\n? *[a-zA-Z]*: *"
RS="} *\n"
}
NR<=2 { a[NR] = $3 }
END { print (a[1]/a[2]) }
You can use that file with the command:
$ awk -f <awk-file> <data-file>
Or you can make it executable and call it directly.
awk separates the input data into records, which in turn are separated into fields. In the BEGIN block, I carefully craft the record and field separators so that the interesting metric is in the 3rd field of a record. (The first field is data {)
Then for the first and second record, I store the 3rd fields in an array a.
At the end, I print the ratio between the first and second elements of the array.
Update: I managed to get it down to 3 lines:
BEGIN { RS="} *\n" }
NR<=2 { a[NR] = $6 }
END { print (a[1]/a[2]) }
Without setting the field separator, it remains at the default. So $1 is data, $2 is {, $3 is the first metric:, $4 is the first number, $5 is the second metric:, and $6 is the number we want.
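The record and field numbering can be mimicked in Python to check which field holds the value. This is only a sketch of the same splitting logic: the function name sixth_field_ratio and the indentation of the sample text are assumptions.

```python
import re

# Sample input; the two-space indentation is an assumption about the real output.
SAMPLE = """\
data {
  metric: 0
  metric: 1234.5
  metric: 230499
  metric: 234234
}
data {
  metric: 0
  metric: 6789
  metric: 23526
  metric: 234634767
}
"""


def sixth_field_ratio(text):
    # Mirror RS="} *\n": a record ends at a closing brace followed by a newline.
    records = [r for r in re.split(r"\} *\n", text) if r.strip()]
    # Mirror awk's default field splitting: runs of whitespace, newlines included.
    fields = [record.split() for record in records[:2]]
    # awk's $6 is 1-based, so it corresponds to index 5 here.
    a = [float(f[5]) for f in fields]
    return a[0] / a[1]


first_record_fields = re.split(r"\} *\n", SAMPLE)[0].split()
print(first_record_fields[:6])
print(sixth_field_ratio(SAMPLE))
```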
Upvotes: 0
Reputation: 701
It looks like one of your requirements is to use only bash commands (grep, sed, etc.). But be aware that you will need something else to do the decimal division. The simplest choice is bc.
Here is my suggestion using grep, sed, cut and bc. I did not try to make it compact. In theory, you should be able to use only one big sed command!
./yourProgram | grep metric | sed -n 2~4p | sed -r 's/^\s+//' | cut -f2 -d' ' | sed 'N;s_\n_ / _' | bc -l
Let's go through it slowly:
grep metric selects the lines containing "metric".
sed -n 2~4p selects one line out of four, starting from the second line.
sed -r 's/^\s+//' removes the blank characters at the beginning of the lines. -r is the extended regex option (needed for \s and +); it is not mandatory, but it makes the expression look nicer. On macOS, use -E instead.
cut -f2 -d' ' selects the 2nd field of each line (the delimiter being a space).
sed 'N;s_\n_ / _' replaces the newline with " / ". Note that we use "_" instead of "/" as the delimiter of the s command so that the "/" in the replacement does not need escaping.
bc -l does the operation.
Upvotes: 1
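The stages of the grep/sed/cut/bc pipeline above can be traced one at a time in Python, which may help when checking what each stage passes on. This is an illustrative sketch only: the function name pipeline_ratio and the indented sample text are assumptions, and the final division uses Python floats, so the printed digits will not match the precision of bc -l exactly.

```python
# Sample input; the two-space indentation is an assumption about the real output.
SAMPLE = """\
data {
  metric: 0
  metric: 1234.5
  metric: 230499
  metric: 234234
}
data {
  metric: 0
  metric: 6789
  metric: 23526
  metric: 234634767
}
"""


def pipeline_ratio(text):
    # grep metric: keep only the lines containing "metric".
    lines = [ln for ln in text.splitlines() if "metric" in ln]
    # sed -n 2~4p: keep lines 2, 6, 10, ... (1-based), i.e. every 4th line from the 2nd.
    lines = lines[1::4]
    # sed -r 's/^\s+//': strip the leading whitespace.
    lines = [ln.lstrip() for ln in lines]
    # cut -f2 -d' ': take the second space-delimited field.
    values = [ln.split(" ")[1] for ln in lines]
    # sed 'N;s_\n_ / _' | bc -l: join the two values with a division and evaluate it.
    return float(values[0]) / float(values[1])


ratio = pipeline_ratio(SAMPLE)
print(ratio)
```

Like the shell version, this relies on every block having exactly four metric lines; a block of a different length would shift the selection.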
Reputation: 247012
Here's an obscure answer: Tcl. The syntax of that output is similar to Tcl syntax, so we can define a procedure named data and a procedure named metric: and execute that output like a Tcl script. You'd run it like this:
tclsh pct.tcl <(the process that produces the output)
And the "pct.tcl" script is:
#!/usr/bin/env tclsh
set n 0
set values [dict create]
proc data {block} {
uplevel 1 $block
incr ::n
}
proc metric: {value} {
dict lappend ::values $::n $value
}
source [lindex $argv 0]
foreach num [dict get $values 0] denom [dict get $values 1] {
if {$denom == 0} {
puts "$num / $denom = Inf"
} else {
puts [format "%s / %s = %.2f" $num $denom [expr {double($num) / $denom}]]
}
}
output:
0 / 0 = Inf
1234.5 / 6789 = 0.18
230499 / 23526 = 9.80
234234 / 234634767 = 0.00
Upvotes: 1