user1902824

Sed: Complicated replace after pattern (on same line)

Suppose you have some text like this:

  foobar 42                  | ff 00 00 00 00
  foobaz 00                  | 0a 00 0b 00 00
  foobie 00                  | 00 00 00 00 00
  bar    00                  | ab ba 00 cd 00

and you want to wrap every non-00 value on the right-hand side of the | in (), but only on lines where the left-hand side of the | contains 00. The desired result:

  foobar 42                  | ff 00 00 00 00
  foobaz 00                  | (0a) 00 (0b) 00 00
  foobie 00                  | 00 00 00 00 00
  bar    00                  | (ab) (ba) 00 (cd) 00

Is there a good way of going about this using sed, or am I trying to stretch beyond the capabilities of the language?

Here's my work so far:

s/[^0]\{2\}/(&)/g wraps your RHS values

/[^|]*00[^|]*|/ can be used as an address to a command to operate only on valid lines

The trick now is to formulate a command that executes in a portion of the pattern space.

This really isn't line oriented, which may explain why I'm having trouble getting an expression that works.

Upvotes: 4

Views: 548

Answers (5)

Dru

Reputation: 1428

Well, it seems (though I do it all the time) that piping sed to sed to sed means I didn't do it right the first time. Here's one way:

sed -r '/00.*\|/  {   ## match lines with a zero before the pipe

    ### surround trailing digits with ()
    ##  
     s/(\w\w) (\w\w) (\w\w) (\w\w) (\w\w)$/(\1) (\2) (\3) (\4) (\5)/;  

    ### unwrap the zeroes again: (00) becomes 00
    ##
    s/\(00\)/00/g; 

}'  txt
  foobar 42                  | ff 00 00 00 00
  foobaz 00                  | (0a) 00 (0b) 00 00
  foobie 00                  | 00 00 00 00 00
  bar    00                  | (ab) (ba) 00 (cd) 00

ok!

Upvotes: 2

potong

Reputation: 58361

This might work for you (GNU sed):

 sed -r '/^\s*\S+\s*00/!b;s/\b([^0][^0]|0[^0]|[^0]0)\b/(&)/g' file

This skips lines that do not begin with a word followed by 00. On the remaining lines it wraps parentheses around every two-character word that is not 00: either one with no 0 at all, or one with exactly one 0 and one non-0 character.

Upvotes: 3

Ed Morton

Reputation: 203169

$ awk 'BEGIN{ FS=OFS="|" } $1~/ 00 /{gsub(/[^ ][^0 ]|[^0 ][^ ]/,"(&)",$2)} 1' file
  foobar 42                  | ff 00 00 00 00
  foobaz 00                  | (0a) 00 (0b) 00 00
  foobie 00                  | 00 00 00 00 00
  bar    00                  | (ab) (ba) 00 (cd) 00

In case the string you want to search for ever gets more complicated than 2 0s, here's a more generally extensible approach since it doesn't require you to write an RE that negates the string:

$ awk '
    BEGIN{ FS=OFS="|" }
    $1 ~ / 00 /{
        split($2,a,/ /)
        $2=""
        for (i=2;i in a;i++)
            $2 = $2 " " (a[i] == "00" ? a[i] : "(" a[i] ")")
    }
    1
' file
  foobar 42                  | ff 00 00 00 00
  foobaz 00                  | (0a) 00 (0b) 00 00
  foobie 00                  | 00 00 00 00 00
  bar    00                  | (ab) (ba) 00 (cd) 00
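
As a hedged sketch of how the split-based approach could be generalized, the token to leave unwrapped can be passed in as an awk variable instead of being hard-coded. The function name wrap_unless and the variable names skip and s are made up for illustration; they are not part of the answer above.

```shell
# Hypothetical helper: wrap every RHS token except the one given as $1.
wrap_unless() {
    awk -v skip="$1" '
        BEGIN { FS = OFS = "|"; s = skip "" }   # s is a plain string, so == below compares as strings
        # Substring test rather than a regex match, so skip is taken literally.
        index($1, " " s " ") {
            n = split($2, a, / /)   # a[1] is empty because $2 starts with a space
            $2 = ""
            for (i = 2; i <= n; i++)
                $2 = $2 " " (a[i] == s ? a[i] : "(" a[i] ")")
        }
        1
    '
}

result=$(printf '%s\n' 'foobar 42 | ff 00 00' 'foobaz 00 | 0a 00 0b' | wrap_unless 00)
printf '%s\n' "$result"
```

Like the script above, this assumes single spaces between the values after the |; a trailing space on the line would produce an empty () at the end.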

Upvotes: 4

chris-l

Reputation: 2841

OK, try this!

$ sed '/00 *|/ { h; s/|.*/|/; x; s/.*|//; s/\(0[1-9a-f]\|[1-9a-f][0-9a-f]\)/(\1)/g; H; x; s/\n//; }' yourfile.txt

The output I get is this:

foobar 42                  | ff 00 00 00 00
foobaz 00                  | (0a) 00 (0b) 00 00
foobie 00                  | 00 00 00 00 00
bar    00                  | (ab) (ba) 00 (cd) 00

Edited so that it doesn't touch lines without 00 before the |.

Upvotes: 1

Jonathan Leffler

Reputation: 753495

I think awk is probably the better tool for this job, but it can be done with sed:

sed '/^[^ ]*  *00 *|/{
         :a
         s/\(|.*[^(]\)\([0-9a-f][1-9a-f]\)/\1(\2)/
         t a
         :b
         s/\(|.*[^(]\)\([1-9a-f][0-9a-f]\)/\1(\2)/
         t b
     }' data

The script looks for lines containing 00 before the pipe, and only applies the operations to those lines. There are two substitute operations, each wrapped in a loop. The :a and :b lines are labels. The t a and t b commands are conditional jumps to the named label, taken if a substitution has been performed since the last input line was read or the last conditional jump was taken.

The two substitutions are almost symmetric: the first deals with any number not ending in 0; the second deals with any number not starting with 0; between them, they ignore 00. Each pattern looks for a pipe, any sequence of characters not ending with an open parenthesis (, and the appropriate pair of digits; it replaces that so that the number ends up inside parentheses. The loops are necessary because a g modifier doesn't start from the beginning again, and the patterns work backwards through the numbers.
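
A small way to see why the loops matter, offered as a sketch rather than as part of the script above: run the first substitution on one sample line, once with the g flag and once inside a :a / t a loop. The variable names here are only for illustration.

```shell
line='bar 00 | ab ba 00 cd 00'
subst='s/\(|.*[^(]\)\([0-9a-f][1-9a-f]\)/\1(\2)/'

# With g: the greedy .* makes the single match run from the pipe to the last
# candidate pair, so no pipe is left after it for g to match again.
with_g=$(printf '%s\n' "$line" | sed "${subst}g")

# With a loop: each pass starts again from the pipe and wraps the last
# pair that is not already preceded by an open parenthesis.
with_loop=$(printf '%s\n' "$line" | sed -e ':a' -e "$subst" -e 't a')

printf '%s\n' "$with_g" "$with_loop"
# bar 00 | ab ba 00 (cd) 00
# bar 00 | (ab) (ba) 00 (cd) 00
```

The g version wraps only the last pair, while the loop version keeps substituting until t a finds that nothing changed.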

Given this data file (a slightly extended version of yours):

foobar 42                  | ff 00 00 00 00
foobaz 00                  | 0a 00 0b 00 00
foobie 00                  | 00 00 00 00 00
bar    00                  | ab ba 00 cd 00
fizbie    00               | ab ba 00 cd 90
fizzbuzz    00             | ab ba 00 cd 09

the output from the script is:

foobar 42                  | ff 00 00 00 00
foobaz 00                  | (0a) 00 (0b) 00 00
foobie 00                  | 00 00 00 00 00
bar    00                  | (ab) (ba) 00 (cd) 00
fizbie    00               | (ab) (ba) 00 (cd) (90)
fizzbuzz    00             | (ab) (ba) 00 (cd) (09)

It is moderately educational to add a p after each of the substitute commands, so you can see how the substitutions work:

foobar 42                  | ff 00 00 00 00
foobaz 00                  | 0a 00 (0b) 00 00
foobaz 00                  | (0a) 00 (0b) 00 00
foobaz 00                  | (0a) 00 (0b) 00 00
foobie 00                  | 00 00 00 00 00
bar    00                  | ab ba 00 (cd) 00
bar    00                  | ab (ba) 00 (cd) 00
bar    00                  | (ab) (ba) 00 (cd) 00
bar    00                  | (ab) (ba) 00 (cd) 00
fizbie    00               | ab ba 00 (cd) 90
fizbie    00               | ab (ba) 00 (cd) 90
fizbie    00               | (ab) (ba) 00 (cd) 90
fizbie    00               | (ab) (ba) 00 (cd) (90)
fizbie    00               | (ab) (ba) 00 (cd) (90)
fizzbuzz    00             | ab ba 00 cd (09)
fizzbuzz    00             | ab ba 00 (cd) (09)
fizzbuzz    00             | ab (ba) 00 (cd) (09)
fizzbuzz    00             | (ab) (ba) 00 (cd) (09)
fizzbuzz    00             | (ab) (ba) 00 (cd) (09)

Upvotes: 1
