Reputation: 3919
I have a string like the one below:
text = "\n first \n second \n third"
I want to split this string on newline characters and keep the delimiters (\n and \r\n). So far I have tried this: text.split(/(?=\r?\n)/g)
The result looks like this:
["↵ first ", "↵ second ", "↵ third"]
But I want this:
["↵", " first ↵", " second ↵", " third"]
What is the correct Regex for that?
Upvotes: 1
Views: 2112
Reputation: 48711
You could match on [^\n]*\n? (with the g flag enabled):
text = "\n\n first \n\n sth \r with \r\n second \r\n third \n forth \r";
console.log(text.match(/[^\n]*\n?/g));
You may need to .pop() the returned array, because the last match is always an empty string:
var matches = text.match(/[^\n]*\n?/g);
matches.pop();
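The two steps above can be wrapped in a small function. This is only a sketch: the name splitKeepingNewlines is hypothetical and not part of the answer. Because [^\n] also matches \r, a CRLF pair stays attached to the end of its line.

```javascript
// Hypothetical helper; the name is illustrative only.
function splitKeepingNewlines(text) {
  // Each match is a run of non-LF characters plus an optional trailing LF.
  const matches = text.match(/[^\n]*\n?/g);
  // The pattern can match the empty string, so the last match
  // (taken at the very end of the input) is always "" and is dropped.
  matches.pop();
  return matches;
}

const lfResult = splitKeepingNewlines("\n first \n second \n third");
const crlfResult = splitKeepingNewlines("a\r\nb");
console.log(lfResult);
console.log(crlfResult);
```

Since every character of the input ends up in exactly one match, joining the returned array with an empty string gives back the original text.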
Upvotes: 3
Reputation: 626758
You may match any text up to a CRLF, an LF, or the end of the string:
text.match(/.*(?:$|\r?\n)/g).filter(Boolean)
// -> (4) ["↵", " first ↵", " second ↵", " third"]
The .*(?:$|\r?\n) pattern matches:
.* - any 0 or more chars other than line break chars
(?:$|\r?\n) - either the end of the string, or an optional carriage return followed by a newline.
JS demo:
console.log("\r\n first \r\n second \r\n third".match(/.*(?:$|\r?\n)/g));
console.log("\n first \r\n second \r third".match(/.*(?:$|\r?\n)/g));
console.log("\n\n\n first \r\n second \r third".match(/.*(?:$|\r?\n)/g));
In JS environments that support the ECMAScript 2018 standard, it is as simple as using a lookbehind pattern:
text.split(/(?<=\r?\n)/)
It will split at every position that immediately follows an LF, optionally preceded by a CR.
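As a quick check of the lookbehind approach, here is a minimal sketch on an input that mixes LF and CRLF. The variable names are made up for the illustration, and it assumes a runtime with lookbehind support.

```javascript
// Illustrative input mixing CRLF and LF line endings.
const mixedInput = "\r\n first \n second \r\n third";

// Split at every position that is directly preceded by an LF.
// The delimiter itself is not consumed, so it stays at the end of each piece.
const mixedParts = mixedInput.split(/(?<=\r?\n)/);

console.log(mixedParts);
// Nothing is consumed by the split, so the pieces rebuild the input.
console.log(mixedParts.join("") === mixedInput);
```

A position between \r and \n is never a split point, because the character right before it is \r rather than \n.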
Another splitting regex is /^(?!$)/m:
console.log("\r\n first \r\n second \r\n third".split(/^(?!$)/m));
console.log("\n first \r\n second \r third".split(/^(?!$)/m));
console.log("\n\n\n first \r\n second \r third".split(/^(?!$)/m));
Here, the string is split at each start of a line (that is, right after a CR or LF), unless that position is also the end of a line. Consecutive line breaks therefore stay together in one item.
Note you do not need a global modifier with String#split since it splits at all found positions by default.
Upvotes: 2
Reputation: 10929
You can use this simple regex:
/.*?(\n|$)/g
It matches the shortest run of characters (other than line breaks) that ends with a newline '\n' or with the end of the string.
You can access the matches as an array (this works like splitting, but keeps the separator in each match).
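A minimal sketch of how this regex might be used with String#match. The filter(Boolean) step is an addition that the answer does not mention: the pattern also matches the empty string at the very end of the input. The sketch assumes LF-only line endings, since . does not match a carriage return in JavaScript.

```javascript
// Illustrative input with LF-only line endings (an assumption of this sketch).
const sampleText = "\n first \n second \n third";

// With the g flag, String#match returns every full match as an array;
// the capturing group does not affect the returned values.
const rawMatches = sampleText.match(/.*?(\n|$)/g);

// The last match is the empty string found at the end of the input, so drop it.
const pieces = rawMatches.filter(Boolean);

console.log(pieces);
```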
Upvotes: 0
Reputation: 521093
Your JavaScript version might not support lookbehinds. But here is a trick we can use which avoids them:
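let text = "\n first \n second \n third";
// Double every newline so that one copy can be consumed by the split.
text = text.replace(/\n/g, "\n\n");
// Split on a newline that is not followed by another newline.
const terms = text.split(/\n(?!\n)/);
console.log(terms);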
This works by replacing every newline \n with two of them \n\n, and then splitting on \n(?!\n). That is, after making this replacement, we split on \n which is not followed by another newline character. This results in consuming the second newline during the split, while retaining the first one which we want to appear in the output.
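To make the two stages visible, the sketch below prints the intermediate string before splitting it. The variable names are illustrative, and it assumes the input has no blank lines: a run of consecutive newlines would be doubled as a whole and would not come back unchanged.

```javascript
// Illustrative input without consecutive newlines (an assumption of this sketch).
const originalText = "\n first \n second \n third";

// Stage 1: every "\n" becomes "\n\n".
const doubledText = originalText.replace(/\n/g, "\n\n");
console.log(JSON.stringify(doubledText));

// Stage 2: split on the second newline of each pair, which is consumed,
// while the first newline stays at the end of the preceding piece.
const keptTerms = doubledText.split(/\n(?!\n)/);
console.log(keptTerms);
```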
Upvotes: 2