Reputation: 79
As seen in the snippet, I have a regex that should identify any empty lines. (I'm aware I can just split on \n\n, but that doesn't suit my purposes.) I've tested it in an editor, and it picks up every empty line when using the find tool. But in JS, I'm still getting the entire file back. What am I missing here?
const mockData = `This is some fake data
with multiple sentences
and line breaks`;
const newArr = mockData.split(/^\s*$/);
console.log(newArr[0]);
Upvotes: 0
Views: 47
Reputation: 29115
You have a multiline string but aren't using the m (multiline) flag. Without it, ^ and $ match only the start and end of the entire string, so you'd only split if the entire string were composed of whitespace:
// multiline string that contains only whitespace
const mockData = `
`;
const newArr = mockData.split(/^\s*$/);
console.log(newArr);
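To make the difference concrete, here is a minimal sketch of the no-flag behaviour. The sample string and variable names are made up for illustration and are not the asker's data:

```javascript
// Hypothetical sample text with one empty line and one whitespace-only line.
const sampleText = "first\n\nsecond\n   \nthird";

// No m flag: ^ and $ anchor to the start/end of the whole string only.
const wholeStringOnly = /^\s*$/;

// The text contains non-whitespace, so the pattern cannot match anywhere.
const matchesText = wholeStringOnly.test(sampleText);

// A string made only of whitespace matches from start to end.
const matchesBlank = wholeStringOnly.test(" \n\t\n ");

// With no match, split() returns the whole input as a single element.
const unsplit = sampleText.split(wholeStringOnly);

console.log(matchesText, matchesBlank, unsplit.length); // false true 1
```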
Using the m flag, the ^ and $ characters instead match the start and end of each line. So now the regex splits on lines that are either empty or composed only of whitespace characters:
const mockData = `This is some fake data
with multiple sentences
and line breaks`;
const newArr = mockData.split(/^\s*$/m);
console.log(newArr);
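One thing to watch for with this version: the match can end just before a line break, so the pieces may keep stray newlines at their edges. A small sketch, assuming a made-up sample string with blank lines in it, where trimming each piece is one possible cleanup:

```javascript
// Hypothetical sample: an empty line and a whitespace-only line between text.
const blankLineSample = "first\n\nsecond\n   \nthird";

// With the m flag the blank lines are found, but the newline characters
// around them stay attached to the neighbouring pieces.
const rawPieces = blankLineSample.split(/^\s*$/m);
console.log(JSON.stringify(rawPieces)); // ["first\n","\nsecond\n","\nthird"]

// Trimming each piece removes the leftover line breaks.
const trimmedPieces = rawPieces.map((piece) => piece.trim());
console.log(JSON.stringify(trimmedPieces)); // ["first","second","third"]
```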
If you intend to split at newlines and empty lines, leaving no blanks, then you can drop the ^ and $ characters entirely, since they are actually more trouble than they are worth. The engine can match just before a newline, because that position is the end of a line ($), which leaves the newline itself in the result. So, instead of trying to work around that with more regex, just split on whitespace followed by newlines, or newlines followed by whitespace:
const mockData = `This is some fake data
with multiple sentences
and line breaks`;
const newArr = mockData.split(/\s*[\r\n]+|[\r\n]+\s*/);
console.log(newArr);
With this you don't need to use the multiline flag, since you never use the behaviour it introduces.
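If the input can start or end with a line break, this split still produces empty strings at the edges. A hedged sketch of wrapping it in a helper (splitNonBlankLines is a hypothetical name, and the sample input is invented) that filters those out:

```javascript
// Hypothetical helper: split on runs of line breaks plus adjacent whitespace,
// then drop the empty strings that appear when the text starts or ends
// with a line break.
function splitNonBlankLines(text) {
  return text
    .split(/\s*[\r\n]+|[\r\n]+\s*/)
    .filter((line) => line !== "");
}

// Made-up input mixing \r\n and \n, with blank and whitespace-only lines.
const mixedInput = "\nfirst\r\n\r\nsecond\n   \nthird\n";
const cleanLines = splitNonBlankLines(mixedInput);
console.log(JSON.stringify(cleanLines)); // ["first","second","third"]
```

Note that this splits on every line break, not only on blank lines, so each non-blank line becomes its own element.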
Also, I should note that [\r\n]+ is a slight cheat on my part. The end-of-line sequence is either \r\n or just \n; you will very likely never encounter a lone \r. The strictly correct regex is therefore \r?\n, which I find ugly, especially when you try to repeat it: (\r?\n)+. A character class is ever so slightly inaccurate, but in a way that should never affect the result in practice.
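For anyone curious where the two patterns actually differ, here is a short comparison. The inputs are invented for illustration, and the strict pattern uses a non-capturing group so that split() does not add captured text to the result:

```javascript
// Strict: only \n or \r\n count as a line break.
const strictBreaks = /(?:\r?\n)+/;
// Loose: any run of \r and \n characters counts.
const looseBreaks = /[\r\n]+/;

// Windows-style line endings: both patterns agree.
const windowsText = "a\r\n\r\nb";
const strictOnWindows = windowsText.split(strictBreaks);
const looseOnWindows = windowsText.split(looseBreaks);
console.log(JSON.stringify(strictOnWindows)); // ["a","b"]
console.log(JSON.stringify(looseOnWindows)); // ["a","b"]

// A lone \r is the only case where they disagree.
const loneCrText = "a\rb";
const strictOnLoneCr = loneCrText.split(strictBreaks);
const looseOnLoneCr = loneCrText.split(looseBreaks);
console.log(JSON.stringify(strictOnLoneCr)); // ["a\rb"]
console.log(JSON.stringify(looseOnLoneCr)); // ["a","b"]
```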
Upvotes: 1
Reputation: 178422
Using the multiline flag works better:
const newArr = mockData.split(/\s*$/m);
Take your pick:
const re1 = /^\s*|\s*$/m
const re2 = /^\s*$/m
const re3 = /\s*$/m
const mockData = `This is some fake data
with multiple sentences
and line breaks`;
const newArr1 = mockData.split(re1);
console.log(JSON.stringify(newArr1))
const newArr2 = mockData.split(re2);
console.log(JSON.stringify(newArr2))
const newArr3 = mockData.split(re3);
console.log(JSON.stringify(newArr3))
Upvotes: 0