user1714423

Reputation: 191

Removing newlines between tokens

I have a file that contains some information spanning multiple lines. For certain other bash scripts of mine to work properly, I need this information to be on a single line. However, I don't want to remove every newline in the file.

What I want to do is replace newlines only between each pair of STARTINGTOKEN and ENDINGTOKEN. The two tokens are always on different lines, and the pairs never nest or overlap: for instance, it is impossible to have two STARTINGTOKENs in a row before an ENDINGTOKEN.

I found that I can replace newlines with spaces using tr "\n" " ", and that I can select a range of lines spanning both tokens with sed -e '/STARTINGTOKEN/,/ENDINGTOKEN/!d'

However, I can't figure out how to combine these operations while leaving the remainder of the file untouched.

Any suggestions?

Upvotes: 2

Views: 563

Answers (4)

potong

Reputation: 58420

This might work for you (GNU sed):

sed '/STARTINGTOKEN/!b;:a;$bb;N;/ENDINGTOKEN/!ba;:b;s/\n//g' file

or:

sed -r '/(START|END)INGTOKEN/,//{/STARTINGTOKEN/{h;d};H;/ENDINGTOKEN/{x;s/\n//gp};d}' file
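
For anyone who finds the first one-liner hard to read, here is a sketch of the same command spread over several lines with comments. It is meant to behave the same way. The wrapper name join_block and the sample input are made up for this illustration, and GNU sed is assumed, as in the answer.

```shell
# join_block is a hypothetical helper name used only for this sketch.
join_block() {
  sed '
    # Lines without the start token skip the rest of the script
    # and are printed unchanged.
    /STARTINGTOKEN/!b
    :a
    # On the last input line, stop collecting and jump to the join step.
    $bb
    # Append the next input line to the pattern space.
    N
    # Keep collecting until the end token shows up.
    /ENDINGTOKEN/!ba
    :b
    # Delete every newline embedded in the collected block.
    s/\n//g
  ' "$@"
}

# Made-up sample input: one block surrounded by ordinary lines.
printf '%s\n' foo 'STARTINGTOKEN a' b 'c ENDINGTOKEN' bar | join_block
```

With no arguments the function reads standard input; with a file name it reads that file, just like the one-liner.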

Upvotes: -1

anubhava

Reputation: 785146

Using awk:

awk '/STARTINGTOKEN/ || l != "" {l = l $0; if (/ENDINGTOKEN/) {print l; l = ""}; next}
     1' input.file

Upvotes: 0

William

Reputation: 4935

This seems to work:

 sed -e '/STARTINGTOKEN/{ :next ; /ENDINGTOKEN/!{N;b next;}; s/\n//g;}' "yourfile"

Once it finds the starting token it loops, picking up lines until it finds the ending token, then removes all the embedded newlines. The joined block is printed in place, every line outside a block passes through unchanged, and the process repeats for the next block.
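
The loop described above can also be written out explicitly in awk, which some people find easier to follow than sed branching. This is only a sketch of the same idea, not the command from this answer: the function name join_with_getline and the variable names are made up, and lines outside a block are printed as they are.

```shell
# join_with_getline is a hypothetical helper name used only for this sketch.
join_with_getline() {
  awk '
    /STARTINGTOKEN/ {
      block = $0
      # Keep pulling in lines until the end token appears
      # or the input runs out.
      while (block !~ /ENDINGTOKEN/ && (getline line) > 0)
        block = block line
      # Print the collected block as a single line.
      print block
      next
    }
    # Everything outside a block is printed as-is.
    { print }
  ' "$@"
}

# Made-up sample input: one block surrounded by ordinary lines.
printf '%s\n' foo 'STARTINGTOKEN a' b 'c ENDINGTOKEN' bar | join_with_getline
```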

Upvotes: 0

Kent

Reputation: 195059

Are you looking for this?

 awk '/STARTINGTOKEN/{f=1} /ENDINGTOKEN/{f=0} {if(f)printf "%s",$0;else print}' file

example:

kent$  cat file
foo
bar
STARTINGTOKEN xx
1
2
ENDINGTOKEN yy
3
4
STARTINGTOKEN mmm
5
6
7
nnn ENDINGTOKEN
8
9

kent$  awk '/STARTINGTOKEN/{f=1} /ENDINGTOKEN/{f=0} {if(f)printf "%s",$0;else print}' file
foo
bar
STARTINGTOKEN xx12ENDINGTOKEN yy
3
4
STARTINGTOKEN mmm567nnn ENDINGTOKEN
8
9
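
The question mentions tr "\n" " ", so if the joined lines should be separated by a space instead of being glued together, a small variation of the same flag idea might do. This is a sketch only: the function name join_with_spaces and the variable name inside are made up for illustration.

```shell
# join_with_spaces is a hypothetical helper name used only for this sketch.
join_with_spaces() {
  awk '
    # Turn the flag on at the start token and off at the end token.
    /STARTINGTOKEN/ { inside = 1 }
    /ENDINGTOKEN/   { inside = 0 }
    # Inside a block: emit the line followed by a space, not a newline.
    # The end-token line has the flag off, so it ends with a newline.
    { if (inside) printf "%s ", $0; else print }
  ' "$@"
}

# Made-up sample input modelled on the example file above.
printf '%s\n' foo 'STARTINGTOKEN xx' 1 2 'ENDINGTOKEN yy' 3 | join_with_spaces
```

On that sample input the block comes out as the single line STARTINGTOKEN xx 1 2 ENDINGTOKEN yy, while foo and 3 stay on their own lines.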

Upvotes: 2
