Reza

Reputation: 3919

Split a string on new lines while keeping delimiter in JavaScript

I have a string like this:

text = "\n first \n second \n third"

I want to split this string on newline characters and keep the delimiter (\n or \r\n) at the end of each piece. So far I have tried text.split( /(?=\r?\n)/g ), which gives:

["↵ first ", "↵ second ", "↵ third"]

But I want this:

["↵", " first ↵", " second ↵", " third"]

What is the correct Regex for that?

Upvotes: 1

Views: 2112

Answers (4)

revo

Reputation: 48711

You could match on [^\n]*\n? with the g flag enabled:

text = "\n\n first \n\n sth \r with \r\n second \r\n third \n forth \r";
console.log(text.match(/[^\n]*\n?/g));

You may need to .pop() the result, because the last element is always an empty string:

var matches = text.match(/[^\n]*\n?/g);
matches.pop();
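
The two steps above can be wrapped in a small function. This is a minimal sketch; the name splitKeepingNewlines is hypothetical and not part of the answer.

```javascript
// Hypothetical helper: returns each line with its trailing "\n"
// (or "\r\n", since [^\n] also matches "\r") still attached.
function splitKeepingNewlines(text) {
  const matches = text.match(/[^\n]*\n?/g);
  matches.pop(); // drop the empty match found at the end of the string
  return matches;
}

const revoParts = splitKeepingNewlines("\n first \n second \n third");
console.log(revoParts); // ["\n", " first \n", " second \n", " third"]
```

The pattern can match the empty string, so match() never returns null here, even for empty input.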

Upvotes: 3

Wiktor Stribiżew

Reputation: 626758

You may match any text up to a CRLF, an LF, or the end of the string:

text.match(/.*(?:$|\r?\n)/g).filter(Boolean)
// -> (4) ["↵", " first ↵", " second ↵", " third"]

The .*(?:$|\r?\n) pattern matches

  • .* - any 0 or more chars other than line break chars (so neither \n nor \r)
  • (?:$|\r?\n) - either end of string or an optional carriage return and a newline.

JS demo:

console.log("\r\n first \r\n second \r\n third".match(/.*(?:$|\r?\n)/g));
console.log("\n first \r\n second \r third".match(/.*(?:$|\r?\n)/g));
console.log("\n\n\n first \r\n second \r third".match(/.*(?:$|\r?\n)/g));
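
The demo prints the raw match arrays, which end with an empty string; .filter(Boolean) removes it. One thing worth checking in your own data: since . does not match \r, a segment that ends in a bare CR (not followed by LF) fits neither branch of the pattern and is skipped. The sketch below shows both cases. The withLoneCr pattern is an assumed variation, not part of the answer, for input where a bare CR should also count as a line ending.

```javascript
const crlfOrLf = /.*(?:$|\r?\n)/g;

// CRLF input: every line is kept, terminator included.
const crlfParts = "\r\n first \r\n second \r\n third".match(crlfOrLf).filter(Boolean);
console.log(crlfParts); // ["\r\n", " first \r\n", " second \r\n", " third"]

// Input with a bare CR: " second \r" is not matched and drops out.
const mixedParts = "\n first \r\n second \r third".match(crlfOrLf).filter(Boolean);
console.log(mixedParts); // ["\n", " first \r\n", " third"]

// Assumed variation (not from the answer) that also accepts a bare CR.
const withLoneCr = /.*(?:$|\r\n?|\n)/g;
const allParts = "\n first \r\n second \r third".match(withLoneCr).filter(Boolean);
console.log(allParts); // ["\n", " first \r\n", " second \r", " third"]
```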

In JS environments that support the ECMAScript 2018 standard, it is as simple as using a lookbehind pattern like

text.split(/(?<=\r?\n)/)

It splits at every position that immediately follows an LF or a CRLF sequence.
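
As a quick check of the lookbehind split on mixed line endings (a minimal sketch; the variable names are illustrative and an ES2018-capable engine is assumed):

```javascript
const mixedText = "\n first \r\n second \n third";

// The lookbehind consumes nothing, so each terminator stays at the end of its chunk.
const lookbehindParts = mixedText.split(/(?<=\r?\n)/);
console.log(lookbehindParts); // ["\n", " first \r\n", " second \n", " third"]

// A trailing newline does not produce an extra empty string.
console.log("a\n".split(/(?<=\r?\n)/)); // ["a\n"]
```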

Another splitting regex is /^(?!$)/m:

console.log("\r\n first \r\n second \r\n third".split(/^(?!$)/m));
console.log("\n first \r\n second \r third".split(/^(?!$)/m));
console.log("\n\n\n first \r\n second \r third".split(/^(?!$)/m));

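Here, the strings are split at the start of every non-empty line, that is, at each position that follows a CR or LF and is not itself immediately followed by a line break or the end of the string. Consecutive line breaks therefore stay together in one item.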
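
The two splitting patterns differ on input with blank lines, which the following sketch illustrates (the sample string is made up for illustration):

```javascript
const blankLines = "\n\n first \n second";

// /^(?!$)/m splits only where a non-empty line starts,
// so the leading blank lines stay together in one item.
const anchorParts = blankLines.split(/^(?!$)/m);
console.log(anchorParts); // ["\n\n", " first \n", " second"]

// The lookbehind version (ES2018+) ends a chunk after every newline.
const perNewlineParts = blankLines.split(/(?<=\r?\n)/);
console.log(perNewlineParts); // ["\n", "\n", " first \n", " second"]
```

Which one fits depends on whether each blank line should be its own array item.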

Note that you do not need the global modifier with String#split, since it splits at all matching positions by default.

Upvotes: 2

Poul Bak

Reputation: 10929

You can use this simple regex:

/.*?(\n|$)/g

It lazily matches any number of characters up to and including a newline (\n), or up to the end of the string.

You can access the matches as an array with String#match. This works like splitting, but keeps the separator in each match.
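
A minimal usage sketch for the question's sample string follows. The .filter(Boolean) call is an addition here, to drop the empty match the pattern produces at the end of the string. Note also that . does not match \r, so this pattern suits LF-only input; with CRLF text the characters before each \r\n are skipped, as the second call shows.

```javascript
const lazyPattern = /.*?(\n|$)/g;

// LF-only input: each match keeps its trailing "\n".
const lfParts = "\n first \n second \n third".match(lazyPattern).filter(Boolean);
console.log(lfParts); // ["\n", " first \n", " second \n", " third"]

// CRLF input: "a\r" cannot be matched, so only "\n" and "b" come back.
const crlfResult = "a\r\nb".match(lazyPattern).filter(Boolean);
console.log(crlfResult); // ["\n", "b"]
```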

Upvotes: 0

Tim Biegeleisen

Reputation: 521093

Your JavaScript version might not support lookbehinds. But here is a trick we can use which avoids them:

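let text = "\n first \n second \n third";
// Double every newline; the m flag is not needed because the pattern has no anchors.
text = text.replace(/\n/g, "\n\n");
// Split on a newline that is not followed by another newline.
const terms = text.split(/\n(?!\n)/);
console.log(terms);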

This works by replacing every newline \n with two of them \n\n, and then splitting on \n(?!\n). That is, after making this replacement, we split on \n which is not followed by another newline character. This results in consuming the second newline during the split, while retaining the first one which we want to appear in the output.
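
The same trick as a reusable function, as a sketch only (the name splitViaDoubling is hypothetical). It also shows a limitation to be aware of: when the input already contains consecutive newlines, the doubled text cannot be split back into the original pieces.

```javascript
// Hypothetical wrapper around the replace-then-split trick.
function splitViaDoubling(input) {
  const doubled = input.replace(/\n/g, "\n\n"); // every "\n" becomes "\n\n"
  return doubled.split(/\n(?!\n)/); // consumes only the last "\n" of each run
}

console.log(splitViaDoubling("\n first \n second \n third"));
// ["\n", " first \n", " second \n", " third"]

// Consecutive newlines in the input are not recovered:
console.log(splitViaDoubling("a\n\nb")); // ["a\n\n\n", "b"]
```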

Upvotes: 2
