Reputation: 7411

How can I write this sed/bash command in awk or perl (or python, or ...)?

I need to replace instances of Progress (n,m) and Progress label="some text title" (n,m) in a scripting language with new values (N,M) where

N= integer ((n/m) * normal)
M= integer ( normal )

The progress statement can be anywhere on the script line (and worse, though not with current scripts, split across lines).

The value normal is a specified number between 1 and 255, and n and m are floating point numbers

So far, my sed implementation is below. It only works on Progress (n,m) formats and not Progress label="Title" (n,m) formats, and it's just plain nuts:

#!/bin/bash
normal=$1; 
file=$2
for n in $(sed -rn '/Progress/s/Progress[ \t]+\(([0-9\. \t]+),([0-9\. \t]+)\).+/\1/p' "$file" )
do 
    m=$(sed -rn "/Progress/s/Progress[ \t]+\(${n},([0-9\. \t]+).+/\1/p" "$file")
    N=$(echo "($normal * $n)/$m" | bc)
    M=$normal
    sed -ri "/Progress/s/Progress[ \t]+\($n,$m\)/Progress ($N,$M)/" "$file"
done

Simply put: this works, but is there a better way?

My toolbox has sed and bash scripting in it, but not so much perl, awk and the like, which I think are better suited to this problem.

Edit: Sample input.

Progress label="qt-xx-95" (0, 50) thermal label "qt-xx-95" ramp(slew=.75,sp=95,closed) Progress (20, 50) Pause  5 Progress (25, 50) Pause  5 Progress (30, 50) Pause  5 Progress (35, 50) Pause  5 Progress (40, 50) Pause  5 Progress (45, 50) Pause  5 Progress (50, 50)
Progress label="qt-95-70" (0, 40) thermal label "qt-95-70" hold(sp=70)        Progress (10, 40) Pause  5 Progress (15, 40) Pause  5 Progress (20, 40) Pause  5 Progress (25, 40) Pause  5

Upvotes: 3

Answers (3)

frankc

Reputation: 11473

This is somewhat brittle but it seems to do the trick? It could be changed to a one-liner with perl -pe but I think this is clearer:


use 5.16.0;
my $normal = $ARGV[0];
while(<STDIN>){
        s/Progress +(label=\".+?\")? *\( *([0-9. ]+) *, *([0-9. ]+) *\)/sprintf("Progress $1 (%d,%d)", int(($2/$3)*$normal),int($normal))/eg;
        print $_;

}

The basic idea is to optionally capture the label clause in $1, and to capture n and m into $2 and $3. We use perl's ability to replace the matched string with an evaluated piece of code by providing the "e" modifier. It's going to fail dramatically if the label clause has any escaped quotes or contains a string that looks like a Progress token, so it's not ideal. I agree that you need an honest-to-goodness parser here, though you could modify this regex to correct some of the obvious deficiencies, like the weak number matching for n and m.
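
Since the question also allows Python, the same idea can be sketched there: `re.sub` accepts a function as the replacement, which plays the role of the "e" modifier. This is only a rough sketch under assumptions, not a drop-in replacement for the Perl above. The names `NUMBER`, `PROGRESS` and `rescale_line` are made up for illustration, the number pattern is stricter than `[0-9. ]+`, and m is assumed to be non-zero.

```python
import re

# Hypothetical helper names; the number pattern accepts "50", "0.5", "5." and ".75".
NUMBER = r"\d+(?:\.\d*)?|\.\d+"
PROGRESS = re.compile(
    r'Progress(\s+label="[^"]*")?\s*\(\s*(' + NUMBER + r')\s*,\s*(' + NUMBER + r')\s*\)'
)

def rescale_line(line, normal):
    """Replace every Progress (n,m) on the line with Progress (N,M)."""
    def replace(match):
        label = match.group(1) or ""          # optional label clause, may be absent
        n, m = float(match.group(2)), float(match.group(3))
        # Multiply before dividing, as the bash original does; assumes m != 0.
        return "Progress%s (%d,%d)" % (label, int(normal * n / m), int(normal))
    return PROGRESS.sub(replace, line)
```

Like the Perl version, this does not handle escaped quotes inside the label.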

Upvotes: 1

Thor

Reputation: 47089

awk has good splitting capabilities, so it might be a good choice for this problem.

Here is a solution that works for the supplied input, let's call it update_m_n_n.awk. Run it like this in bash: awk -f update_m_n_n.awk -v normal=$NORMAL input_file.

#!/usr/bin/awk -f

BEGIN {
  ORS = RS = "Progress"
  FS = "[)(]"
  if(normal == 0) normal = 10
}

NR == 1 { print }

length > 1 { 
  split($2, A, /, */)
  N = int( normal * A[1] / A[2] )
  M = int( normal )
  sub($2, N ", " M)
  print $0
}

Explanation

ORS = RS = "Progress": Split sections at Progress and include Progress in the output.
FS = "[)(]": Separate fields at parenthesis.
NR == 1 { print }: Insert ORS before the first section.
split($2, A, /, */): Assuming there is only one parenthesized item between occurrences of Progress, this splits n and m into the A array.
sub($2, N ", " M): Substitute the new values into the current record.
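
For comparison, the same split-at-Progress idea could look roughly like this in Python. It is a sketch under the same assumption as the awk script (the first parenthesized item after each Progress holds n and m), and `rescale_sections` is a hypothetical name. The default of 10 mirrors the fallback in the BEGIN block.

```python
def rescale_sections(text, normal=10):
    """Split text at "Progress" and rescale the first (n, m) in each section."""
    head, *sections = text.split("Progress")
    out = [head]                                   # text before the first Progress
    for section in sections:
        before, sep, rest = section.partition("(")
        inside, close, after = rest.partition(")")
        parts = inside.split(",")
        if sep and close and len(parts) == 2:
            try:
                n, m = float(parts[0]), float(parts[1])
            except ValueError:
                pass                               # not a numeric pair: leave as-is
            else:
                # Assumes m != 0; int() truncates like awk's int().
                inside = "%d, %d" % (int(normal * n / m), int(normal))
        out.append(before + sep + inside + close + after)
    return "Progress".join(out)                    # put the separators back
```

Sections whose first parenthesized item is not a numeric pair are passed through unchanged.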

Upvotes: 1

crw

Reputation: 685

My initial thought was to try sed with recursive substitutions (t command); however, I suspected that would get stuck.

This perl code might work for statements that are not split across lines. For splits across lines, perhaps it makes sense to write a separate pre-processor to join disparate lines.

The code splits "Progress" statements into separate line-segments, applies any replacement rules then rejoins the segments into one line and prints. Non-matching lines are simply printed. The matching code uses back-references and becomes somewhat unreadable. I have assumed your "normal" parameter can take floating values as the spec didn't seem clear.

#!/usr/bin/perl -w

use strict;

die("Wrong arguments") if (@ARGV != 2);
my ($normal, $file) = @ARGV;
open(FILE, '<', $file) or die("Cannot open $file");

while (<FILE>) {
    chomp();
    my $line = $_;

    # Match on lines containing "Progress"
    if (/Progress/) {

        $line =~ s/(Progress)/\n$1/go;    # Insert newlines on which to split
        my @segs = split(/\n/, $line);    # Split line into segments containing possibly one "Progress" clause

        # Apply text-modification rules
        @segs = map {
            if (/(Progress[\s\(]+)([0-9\.]+)([\s,]+)([0-9\.]+)(.*)/) {
                my $newN = int($2/$4 * $normal);
                my $newM = int($normal);
                $1 . $newN . $3 . $newM . $5;
            } elsif (/(Progress\s+label="[^"]+"[\s\(]+)([0-9\.]+)([\s,]+)([0-9\.]+)(.*)/) {
                my $newN = int($2/$4 * $normal);
                my $newM = int($normal);
                $1 . $newN . $3 . $newM . $5;
            } else {
                $_;    # Segment doesn't contain "Progress"
            }
        } @segs;

        $line = join("", @segs);    # Reconstruct the single line
    }

    print($line,"\n");    # Print all lines
}
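
For statements that are split across lines, one alternative to a separate line-joining pre-processor is to run the substitution over the whole file contents, because `\s` in a regular expression also matches newlines. The Python sketch below illustrates that idea under assumptions: `STATEMENT` and `rescale_text` are hypothetical names, the file is assumed to fit in memory, and m is assumed to be non-zero. As in the Perl above, the original spacing around the numbers is kept.

```python
import re

NUM = r"(\d+(?:\.\d*)?|\.\d+)"
# Groups: 1 = everything up to and including "(", 2 = n, 3 = separator, 4 = m.
STATEMENT = re.compile(
    r'(Progress(?:\s+label="[^"]*")?\s*\(\s*)' + NUM + r'(\s*,\s*)' + NUM
)

def rescale_text(text, normal):
    """Rescale every Progress (n,m) in text, even if it spans several lines."""
    def replace(match):
        n, m = float(match.group(2)), float(match.group(4))
        return "%s%d%s%d" % (
            match.group(1), int(normal * n / m), match.group(3), int(normal)
        )
    return STATEMENT.sub(replace, text)
```

It would be called on the full text, for example `rescale_text(open(path).read(), normal)`, where `path` stands for the script file.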

Upvotes: 0
