DaveA

Reputation: 217

Need to join certain lines based on multiple-line pattern match

I have a file that looks like this:

2014-05-01 00:30:45,511
ZZZ|1|CE|web1||etc|etc
ZZZ|1|CE|web2||etc|etc
ZZZ|1|CE|web3|asd|SDAF
2014-05-01 00:30:45,511
ZZZ|1|CE|web1||etc|etc
ZZZ|1|CE|web2||etc|etc
ZZZ|1|CE|web3|asd|SDAF

I want to collapse this into two lines by replacing each newline that is followed by a certain pattern with a pipe. The desired output is:

2014-05-01 00:30:45,511|ZZZ|1|CE|web1||etc|etc|ZZZ|1|CE|web2||etc|etc|ZZZ|1|CE|web3|asd|SDAF
2014-05-01 00:30:45,511|ZZZ|1|CE|web1||etc|etc|ZZZ|1|CE|web2||etc|etc|ZZZ|1|CE|web3|asd|SDAF

I am trying a multiline match with Perl:

cat file | perl -pe 's/\nZZZ/\|ZZZ/m'

but this does not match.

I can do perl -pe 's/\n//m' but that is too much; I need to match '\nZZZ' so that only lines beginning with ZZZ are joined to the previous line.

Upvotes: 1

Views: 536

Answers (4)

Miller

Reputation: 35198

You just need to enable slurp mode with the -0777 switch. Your regular expression tries to match across lines, but by default -p hands it one line at a time, so within a record nothing ever follows the newline.

The full solution:

perl -0777 -pe 's/\n(?=ZZZ)/|/g' file

Explanation:

Switches:

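  • -0777: Slurps each file whole, so the entire input is read as a single record.
  • -p: Wraps the code in a while(<>){...; print} loop that runs once per record (once per file under -0777, rather than once per line).
  • -e: Tells perl to execute the code given on the command line.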

Code:

  • s/\n(?=ZZZ)/|/g: Replace any newline that is followed by ZZZ with a |. The lookahead (?=ZZZ) only checks for ZZZ; it does not consume it, so ZZZ stays in the output.

Upvotes: 3

Richard RP

Reputation: 525

Try this if you want to avoid slurp mode:

perl -pe 'chomp unless eof; /\|/ and s/^/|/ or $.>1 and s/^/\n/' filename.txt
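  • If the line contains a | (that is, it is one of the ZZZ data lines), prepend a | so it joins onto the previous line.
  • Otherwise, if we are past the first line, prepend a newline to start a new output line.
  • Every line is chomped except the last one, which keeps the newline at the end of the file.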

Upvotes: 2

Borodin

Reputation: 126722

This is a pretty standard pattern: accumulate continuation lines into a buffer, and print the buffer whenever a new record starts. The path to the input file is expected as a parameter on the command line.

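use strict;
use warnings;

my $line;    # record currently being assembled
while (<>) {
  chomp;
  if ( /^ZZZ/ ) {
    # Continuation line: append it to the pending record
    $line .= '|' . $_;
  }
  else {
    # New record: flush the previous one (if any) and start again.
    # Test with defined so that a record of "0" is still printed.
    print $line, "\n" if defined $line;
    $line = $_;
  }
}
# Flush the last record
print $line, "\n" if defined $line;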

Output:

2014-05-01 00:30:45,511|ZZZ|1|CE|web1||etc|etc|ZZZ|1|CE|web2||etc|etc|ZZZ|1|CE|web3|asd|SDAF
2014-05-01 00:30:45,511|ZZZ|1|CE|web1||etc|etc|ZZZ|1|CE|web2||etc|etc|ZZZ|1|CE|web3|asd|SDAF

Upvotes: 0

user1558455


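I would suggest using a lookahead, which matches the newline without consuming the ZZZ that follows it. Because -p reads one line at a time, this also needs slurp mode (-0777) so that the newline and the following ZZZ end up in the same record:

cat file | perl -0777 -pe 's/\n(?=ZZZ)/|/g'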


Upvotes: 0
