rainman168

Reputation: 21

Using SED to remove newlines for an address range

I am having some difficulty getting my SED script to work properly. It appears to only work on the first occurrence. I am basically a UNIX beginner - please bear with me.

Data file looks like this:

exec cics 
end-exec.
exec cics 
send map
end-exec.
exec cics 
end-exec.

Actual output is as follows and appears to work correctly only on the first occurrence:

exec cics end-exec.
exec cics 
send map
end-exec.
exec cics 
end-exec.

Desired output should be as follows:

exec cics end-exec.
exec cics send map end-exec.
exec cics end-exec.

Everything starting with "exec cics" and ending with "end-exec" should be on one line with any newlines removed.

SED script is as follows:

/exec cics/,/end-exec/{
:a    
N
$!ba
s/\n//
}

I had got the code within the curly braces from here: How can I replace a newline (\n) using sed?

My initial script did not have the :a;N;$!ba. Can anyone see what I am missing or doing wrong?

Upvotes: 1

Views: 350

Answers (3)

Jonathan Leffler

Reputation: 754520

This modified version of your sed script seems to do the job:

/exec cics/,/end-exec/{
    :a
    /end-exec/! N
    s/\n/ /g
    t a
}

When saved in the file sed.script, and given the data file (data2):

exec cics
end-exec.
exec cics
send map
end-exec.
exec cics
end-exec.
exec cics
do this
and that
and tother
end-exec.

(with no trailing blanks on any of the data lines), then I get:

$ sed -f sed.script data2
exec cics end-exec.
exec cics send map end-exec.
exec cics end-exec.
exec cics do this and that and tother end-exec.
$

What does the script do?

  1. For each range of lines between exec cics and end-exec,
  2. Set label a,
  3. If the pattern space doesn't contain end-exec add another line to it.
  4. Replace any newlines with spaces.
  5. If there was a substitution, jump back to label a.

While debugging/devising this, I added a couple of extra lines after the t and before the }:

s/^/[[/
s/$/]]/

This helped me see how the data was handled with various other versions of the commands, including the original; the lines were enclosed in [[ and ]] as they were printed.
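As a rough sketch of that debugging trick, the script below is the same one with the two marker lines added before the closing brace. The temporary directory and the names data2 and sed.debug are only assumptions for this illustration, and the sample data is a shortened copy of the file above.

```shell
# Scratch directory so the sketch does not touch existing files (assumption).
tmp=$(mktemp -d)

# Shortened sample data, no trailing blanks.
printf '%s\n' 'exec cics' 'end-exec.' 'exec cics' 'send map' 'end-exec.' > "$tmp/data2"

# Same script as above, plus the two marker substitutions.
cat > "$tmp/sed.debug" <<'EOF'
/exec cics/,/end-exec/{
    :a
    /end-exec/! N
    s/\n/ /g
    t a
    s/^/[[/
    s/$/]]/
}
EOF

debug_out=$(sed -f "$tmp/sed.debug" "$tmp/data2")
printf '%s\n' "$debug_out"
# [[exec cics end-exec.]]
# [[exec cics send map end-exec.]]

rm -rf "$tmp"
```

Each printed line is one pattern space, so a [[ ... ]] pair that spans several lines would show that the newlines had not been replaced yet.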

Tested on Mac OS X 10.9.2 Mavericks using the BSD sed that's supplied.

Upvotes: 1

devnull

Reputation: 123608

The result that you get can be explained by the fact that you substitute only the first newline:

s/\n//

However, even if you performed the substitution globally, i.e. used:

s/\n//g

you'd get:

exec cics end-exec. exec cics  send map end-exec. exec cics  end-exec.

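because $ matches only the last line of the file: $!ba keeps branching back to label a until N has appended every remaining line, so the whole file ends up in the pattern space and is printed as a single line.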
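One way to see this for yourself is to replace the newlines with a visible marker instead of deleting them. This is only a sketch: the | marker and the shortened sample input are my own choices, and the script is otherwise the one from the question.

```shell
# The original loop, but with every newline turned into "|" so the
# contents of the pattern space are easy to see.
slurped=$(printf '%s\n' 'exec cics' 'end-exec.' 'exec cics' 'send map' 'end-exec.' |
sed '/exec cics/,/end-exec/{
:a
N
$!ba
s/\n/|/g
}')
printf '%s\n' "$slurped"
# exec cics|end-exec.|exec cics|send map|end-exec.
```

Every | marks a newline that was inside the pattern space, which shows that the loop did not stop at the first end-exec.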

Instead of branching to the label a unless you match the last line, don't branch when you encounter end-exec. Saying:

sed '/exec cics/,/end-exec/{:a;N;/end-exec/!ba;s/\n/ /g}' filename

would produce:

exec cics end-exec.
exec cics send map end-exec.
exec cics end-exec.

If your input consists only of contiguous blocks, each starting with exec cics and ending with end-exec, you could simplify it:

sed ':a;N;/end-exec/!ba;s/\n/ /g' filename
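The one-liners above put a semicolon right after the label, which GNU sed accepts. As far as I know, some other implementations (BSD sed among them) read the label up to the end of the line, so a more portable spelling would split the commands into separate -e arguments. A sketch with shortened sample input piped in instead of a file:

```shell
# Same logic as the simplified one-liner, one command per -e so that
# the label and the branch each end at the end of their own argument.
joined=$(printf '%s\n' 'exec cics' 'end-exec.' 'exec cics' 'send map' 'end-exec.' |
    sed -e ':a' -e 'N' -e '/end-exec/!ba' -e 's/\n/ /g')
printf '%s\n' "$joined"
# exec cics end-exec.
# exec cics send map end-exec.
```

This relies on the input being well formed: every block must end with end-exec, so that N is never run on the last line of the file.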

Upvotes: 2

Jotne

Reputation: 41460

If you would like to try awk:

awk '{printf (/exec cics/ && NR>1?RS:"")"%s",$0} END {print ""}'
exec cics end-exec.
exec cics send mapend-exec.
exec cics end-exec.

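If a line contains exec cics and is not the first line, a newline is printed before it.
Every other line is appended to the current output line as it stands. No separator is inserted, which is why send map and end-exec. run together in the output above; the space after exec cics presumably comes from the trailing blank in the input.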
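If you want exactly one space between the joined lines, a variant along these lines might do. It is only a sketch, not the command above: it strips trailing blanks first and then joins with a single space. The sample input is piped in and keeps the trailing blank after exec cics from the question.

```shell
tidy=$(printf '%s\n' 'exec cics ' 'end-exec.' 'exec cics ' 'send map' 'end-exec.' |
awk '
    { sub(/[ \t]+$/, "") }            # drop trailing blanks from every line
    /exec cics/ {                     # a new block starts here
        if (NR > 1) printf "\n"       # finish the previous output line
        printf "%s", $0
        next
    }
    { printf " %s", $0 }              # continuation line: join with one space
    END { if (NR > 0) printf "\n" }   # terminate the last output line
')
printf '%s\n' "$tidy"
# exec cics end-exec.
# exec cics send map end-exec.
```

Like the original, this treats any line containing exec cics as the start of a block and never looks at end-exec, so lines between blocks would be glued onto the preceding one.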

Upvotes: 1
