Reputation: 2911
I've been trying to write a regex to match asterisks, tildes, dashes and square brackets.
What I have:
const str = "The] quick [brown] fox **jumps** over ~~the~~ lazy dog --- in the [woods";
console.log(str.match(/[^\]][^\[\]]*\]?|\]/g));
// [
// "The]",
// " quick ",
// "[brown]",
// " fox **jumps** over ~~the~~ lazy dog --- in the ",
// "[woods"
// ];
What I want:
[
"The]",
" quick ",
"[brown]",
" fox ",
"**jumps**",
" over ",
"~~the~~",
" lazy dog ",
"---",
" in the ",
"[woods"
];
Edit:
Some more examples/combinations of the string are:
"The] quick brown fox jumps [over] the lazy [dog"
// [ "The]", " quick brown fox jumps ", "[over]", " the lazy ", "[dog" ]
"The~~ quick brown fox jumps [over] the lazy **dog"
// [ "The~~", " quick brown fox jumps ", "[over]", " the lazy ", "**dog" ]
Edit 2:
I know this is crazy but:
"The special~~ quick brown fox jumps [over the] lazy **dog on** a **Sunday night."
// [ "The special~~", " quick brown fox jumps ", "[over the]", " lazy ", "**dog on**", " a ", "**Sunday night" ]
Upvotes: 1
Views: 526
Reputation: 785316
You may use this regex with more alternations to include your desired matches:
const re = /\[[^\[\]\n]*\]|\b\w+\]|\[\w+|\*\*.+?(?:\*\*|$)|-+|(?:^|~~).+?~~|[\w ]+/mg;
const arr = [
'The special~~ quick brown fox jumps [over the] lazy **dog on** a **Sunday night.',
'The] quick brown fox jumps [over] the lazy [dog',
'The] quick [brown] fox **jumps** over ~~the~~ lazy dog --- in the [woods'
];
arr.forEach(str => {
  const m = str.match(re);
  console.log(m);
});
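To make the alternation easier to follow, here is a sketch that rebuilds the same pattern from its individual branches. The names `branches`, `combined` and `original` are hypothetical, and the per-branch comments are one reading of the pattern rather than the answerer's own explanation. Branches are tried left to right at each position, so the more specific ones come first and the plain-text branch comes last.

```javascript
// The pattern from the answer, copied unchanged for comparison.
const original = /\[[^\[\]\n]*\]|\b\w+\]|\[\w+|\*\*.+?(?:\*\*|$)|-+|(?:^|~~).+?~~|[\w ]+/mg;

// The same pattern, one branch per entry, in the original order.
const branches = [
  /\[[^\[\]\n]*\]/,      // a complete bracketed group, e.g. "[brown]"
  /\b\w+\]/,             // a word followed by an unmatched "]", e.g. "The]"
  /\[\w+/,               // an unmatched "[" followed by a word, e.g. "[woods"
  /\*\*.+?(?:\*\*|$)/,   // "**", then lazily up to the next "**" or end of line
  /-+/,                  // a run of hyphens, e.g. "---"
  /(?:^|~~).+?~~/,       // from start of line or "~~", lazily up to the next "~~"
  /[\w ]+/               // fallback: a run of word characters and spaces
];

// Join the branch sources with "|" and reuse the same flags (m and g).
const combined = new RegExp(branches.map(b => b.source).join("|"), "mg");

console.log(combined.source === original.source); // true
console.log("a [b] --- c".match(combined));
```

Splitting the pattern this way also makes it simpler to add or reorder a branch when another delimiter needs to be supported.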
Upvotes: 1
Reputation: 147196
You can use this regex to split the string. It splits the string on text between one of the delimiters (**, ~~ or []) and either a matching delimiter or the start/end of the string; or on a sequence of hyphens (-). It uses a capture group to ensure the string matched by the regex appears in the output array:
((?:\*\*|^)[A-Za-z. ]+(?:\*\*|$)|(?:~~|^)[A-Za-z. ]+(?:~~|$)|(?:\[|^)[A-Za-z. ]+(?:\]|$)|-+)
const re = /((?:\*\*|^)[A-Za-z. ]+(?:\*\*|$)|(?:~~|^)[A-Za-z. ]+(?:~~|$)|(?:\[|^)[A-Za-z. ]+(?:\]|$)|-+)/;
const str = [
'The] quick [brown] fox **jumps** over ~~the~~ lazy dog --- in the [woods',
'The] quick brown fox jumps [over] the lazy [dog',
'The~~ quick brown fox jumps [over] the lazy **dog',
'The special~~ quick brown fox jumps [over the] lazy **dog on** a **Sunday night.'];
str.forEach(v => console.log(v.split(re).filter(Boolean)));
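As a small illustration of why the capture group and the `filter(Boolean)` call matter, the sketch below uses only the hyphen branch on two short made-up strings (`sample` and `leading` are hypothetical names, not part of the answer). When the pattern passed to `split` contains a capture group, JavaScript includes the captured text in the result; when a match sits at the very start or end of the string, `split` also emits an empty string there.

```javascript
const sample = "lazy dog --- in the";

// No capture group: the hyphens are treated as a separator and dropped.
const withoutGroup = sample.split(/-+/);

// Capture group: the matched hyphens are kept as their own element.
const withGroup = sample.split(/(-+)/);

console.log(withoutGroup); // [ "lazy dog ", " in the" ]
console.log(withGroup);    // [ "lazy dog ", "---", " in the" ]

// A match at the start of the string leaves an empty string in front of it.
const leading = "--- in the".split(/(-+)/);
console.log(leading);                 // [ "", "---", " in the" ]

// filter(Boolean) removes those empty strings.
console.log(leading.filter(Boolean)); // [ "---", " in the" ]
```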
Upvotes: 1