Testaccount

Reputation: 2911

Regex to match asterisks, tildes, dashes and square brackets

I've been trying to write a regex to match asterisks, tildes, dashes and square brackets.

What I have:

const str = "The] quick [brown] fox **jumps** over ~~the~~ lazy dog --- in the [woods";
console.log(str.match(/[^\]][^\[\]]*\]?|\]/g));
// [
//     "The]",
//     " quick ",
//     "[brown]",
//     " fox **jumps** over ~~the~~ lazy dog --- in the ",
//     "[woods"
// ];

What I want:

[
    "The]",
    " quick ",
    "[brown]",
    " fox ",
    "**jumps**",
    " over ",
    "~~the~~",
    " lazy dog ",
    "---",
    " in the ",
    "[woods"
];

Edit:

Some more examples/combinations of the string are:

"The] quick brown fox jumps [over] the lazy [dog"
// [ "The]", " quick brown fox jumps ", "[over]", " the lazy ", "[dog" ]


"The~~ quick brown fox jumps [over] the lazy **dog"
// [ "The~~", " quick brown fox jumps ", "[over]", " the lazy ", "**dog" ]

Edit 2:

I know this is crazy but:

"The special~~ quick brown fox jumps [over the] lazy **dog on** a **Sunday night."
// [ "The special~~", " quick brown fox jumps ", "[over the]", " lazy ", "**dog on**", " a ", "**Sunday night" ]

Upvotes: 1

Views: 526

Answers (2)

anubhava

Reputation: 785316

You may use this regex with more alternations to include your desired matches:

const re = /\[[^\[\]\n]*\]|\b\w+\]|\[\w+|\*\*.+?(?:\*\*|$)|-+|(?:^|~~).+?~~|[\w ]+/mg;
const arr = [
'The special~~ quick brown fox jumps [over the] lazy **dog on** a **Sunday night.',
'The] quick brown fox jumps [over] the lazy [dog',
'The] quick [brown] fox **jumps** over ~~the~~ lazy dog --- in the [woods'
];

arr.forEach(str => {
  // Declare the result locally instead of leaking an implicit global
  const m = str.match(re);
  console.log(m);
});
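
The alternation may be easier to follow with each branch on its own line. The sketch below rebuilds the same pattern from commented pieces. The names `branches`, `annotated`, `sample` and `pieces` are illustrative only, and the branch descriptions are one reading of the pattern, not the answerer's wording.

```javascript
// Each branch of the alternation, in the order the engine tries them.
// `branches` and `annotated` are illustrative names, not part of the answer.
const branches = [
  String.raw`\[[^\[\]\n]*\]`,     // a complete [ ... ] pair
  String.raw`\b\w+\]`,            // a word closed by a stray ]
  String.raw`\[\w+`,              // a stray [ followed by a word
  String.raw`\*\*.+?(?:\*\*|$)`,  // ** up to the next ** or the end of the line
  String.raw`-+`,                 // a run of hyphens
  String.raw`(?:^|~~).+?~~`,      // line start or ~~ up to the next ~~
  String.raw`[\w ]+`,             // plain words and spaces
];
const annotated = new RegExp(branches.join('|'), 'mg');

const sample = 'The] quick [brown] fox **jumps** over ~~the~~ lazy dog --- in the [woods';
const pieces = sample.match(annotated);
console.log(pieces);
```

Branch order matters here: the complete-pair branch is listed before the stray-bracket branches so that "[brown]" is taken whole, and the catch-all `[\w ]+` comes last so it only picks up the text the other branches leave behind.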


Upvotes: 1

Nick

Reputation: 147196

You can use this regex to split the string. It splits on the text between a delimiter (**, ~~, [ or ]) and either the matching delimiter or the start/end of the string, or on a sequence of hyphens (-). The text between delimiters may contain only letters, periods and spaces. The capture group around the whole pattern ensures that the text matched by the regex appears in the output array:

((?:\*\*|^)[A-Za-z. ]+(?:\*\*|$)|(?:~~|^)[A-Za-z. ]+(?:~~|$)|(?:\[|^)[A-Za-z. ]+(?:\]|$)|-+)

const re = /((?:\*\*|^)[A-Za-z. ]+(?:\*\*|$)|(?:~~|^)[A-Za-z. ]+(?:~~|$)|(?:\[|^)[A-Za-z. ]+(?:\]|$)|-+)/;
const str = [
  'The] quick [brown] fox **jumps** over ~~the~~ lazy dog --- in the [woods',
  'The] quick brown fox jumps [over] the lazy [dog',
  'The~~ quick brown fox jumps [over] the lazy **dog',
  'The special~~ quick brown fox jumps [over the] lazy **dog on** a **Sunday night.'];

str.forEach(v => console.log(v.split(re).filter(Boolean)));
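
The role of the capture group and of `filter(Boolean)` can be seen in isolation with a much simpler separator. This is a minimal sketch using only the hyphen branch. The variable names are illustrative and do not come from the answer.

```javascript
// Without a capture group, the separator text is dropped from the result.
const withoutGroup = 'lazy dog --- in the woods'.split(/-+/);
console.log(withoutGroup); // [ 'lazy dog ', ' in the woods' ]

// With a capture group, the matched separator is kept as its own element.
const withGroup = 'lazy dog --- in the woods'.split(/(-+)/);
console.log(withGroup); // [ 'lazy dog ', '---', ' in the woods' ]

// A match at the very start produces a leading empty string...
const leading = '---tail'.split(/(-+)/);
console.log(leading); // [ '', '---', 'tail' ]

// ...which filter(Boolean) removes, because '' is falsy.
const cleaned = leading.filter(Boolean);
console.log(cleaned); // [ '---', 'tail' ]
```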

Upvotes: 1
