fionpo

Reputation: 141

Regex combine lines

Given the following string

45op0
tr ico
JJB Be
tyuh
113-4997
202076
acure
sala mandra

I am looking for the following result:

45op0;113-4997
tr ico;202076
JJB Be;acure
tyuh;sala mandra

Basically combine the 4 lines at the bottom with the 4 at the top, in their original order, in a ; separated list.

This is the regex I have so far:

^((?:[^\r*\n]*[\r*\n]){4})([\s\S]*)

substituted by:

$1;$2

as shown in this demo

As you can see, this does not give the expected result.

Any help will be much appreciated.

Upvotes: 0

Views: 48

Answers (1)

Cary Swoveland

Reputation: 110685

You can use the following regular expression, with the multiline flag set so that ^ matches at the start of every line:

^(.+)\r?\n(?=(?:.*\r?\n){3}(.+))

PCRE demo

For the example given there are four matches, one for each of the lines 45op0, tr ico, JJB Be and tyuh. Each match has two capture groups. The first capture group contains the matched line without its line terminator. For the first match (45op0), capture group 2 contains 113-4997, which is captured in the positive lookahead. The contents of the two capture groups can then be joined, separated by a semicolon, to return 45op0;113-4997

Similarly, for the second match capture group 2 contains 202076, and so on.

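When the line 113-4997 is reached, it is saved in capture group 1 and the lookahead is attempted. If the string ends with a line terminator, the lookahead skips the next three lines and then fails because no non-empty line follows; otherwise it fails while trying to skip the third line. For each of the remaining lines the lookahead fails because fewer than three lines follow.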
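The joining step described above can be sketched in Python. This is only an illustration, not part of the PCRE demo: it assumes Python's `re` module, assumes the input uses `\n` line endings, and the names `text`, `pattern` and `result` are made up for the example.

```python
import re

# The sample input from the question (assumed to use \n line endings).
text = "45op0\ntr ico\nJJB Be\ntyuh\n113-4997\n202076\nacure\nsala mandra"

# re.MULTILINE makes ^ match at the start of every line, not only at
# the start of the string.
pattern = re.compile(r"^(.+)\r?\n(?=(?:.*\r?\n){3}(.+))", re.MULTILINE)

# Join capture group 1 (the line itself) with capture group 2 (the line
# four positions further down, captured inside the lookahead).
pairs = [f"{m.group(1)};{m.group(2)}" for m in pattern.finditer(text)]
result = "\n".join(pairs)
print(result)
```

Because the lookahead consumes nothing, each match ends right after its own line terminator, so the next search starts at the following line.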

The PCRE regex engine performs the following operations.

^(.+)          match a line with 1+ chars, excl. line terminators,
               in cap grp 1 
\r?\n          match the newline and possible carriage return
(?=            begin a positive lookahead
  (?:.*\r?\n)  match an entire line in a non-cap group          
  {3}          execute the non-cap group 3 times (skip 3 lines)
  (.+)         match a line with 1+ chars, excl. line terminators,
               in cap grp 2
)              end positive lookahead
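The same breakdown can be kept next to the pattern itself by using a free-spacing (verbose) mode. The sketch below assumes Python's `re.VERBOSE` flag rather than PCRE's `x` flag; `verbose_pattern`, `sample` and `matches` are hypothetical names, and the sample here ends with a line terminator to show that the result is the same.

```python
import re

# In verbose mode, whitespace in the pattern is ignored and # starts a comment.
verbose_pattern = re.compile(
    r"""
    ^(.+)              # group 1: a non-empty line, without its terminator
    \r?\n              # optional carriage return, then the newline
    (?=                # positive lookahead: inspect without consuming
      (?:.*\r?\n){3}   #   skip three complete lines
      (.+)             #   group 2: the next non-empty line
    )
    """,
    re.MULTILINE | re.VERBOSE,
)

sample = "45op0\ntr ico\nJJB Be\ntyuh\n113-4997\n202076\nacure\nsala mandra\n"
matches = [(m.group(1), m.group(2)) for m in verbose_pattern.finditer(sample)]
print(matches)
```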

Upvotes: 1
