Mister Jojo

Reputation: 22265

how to split text with regex match

Starting from this kind of text:

var txt = "hello [a14] world, and [b74] some more [h85], ..."

I expect this result:

var arr = 
  [ "hello "
  , "[a14]"
  , " world, and "
  , "[b74]"
  , " some more "
  , "[h85]"
  , ", ..."
  ] 

Here is what I have tried so far, but mastering regular expressions is still difficult for me...

let txt = "hello [a14] world, and [b74] some more [h85], ..."

let arr = txt.match(/(\[(.*?)\])|(.*?)/g)

console.log( arr )

Upvotes: 2

Views: 815

Answers (3)

Peter Thoeny

Reputation: 7616

You can use .split(). The trick is to wrap the split condition in a capture group, so that the matched text shows up in the resulting array:

let txt = "hello [a14] world, and [b74] some more [h85], ..."
let regex = /(\[[^\]]*\])/;
let result = txt.split(regex);
console.log(result);

Output:

[
  "hello ",
  "[a14]",
  " world, and ",
  "[b74]",
  " some more ",
  "[h85]",
  ", ..."
]

Explanation:

  • ( - capture group start
  • \[ - scan over [
  • [^\]]* - scan over anything that is not ]
  • \] - scan over ]
  • ) - capture group end
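
As a minimal sketch of why the capture group matters (the sample string and variable names are illustrative, not from the answer above): without the parentheses, .split() discards the bracketed tokens; with them, each token is kept in the result.

```javascript
const sample = "hello [a14] world";

// No capture group: the matched delimiters are dropped.
const withoutGroup = sample.split(/\[[^\]]*\]/);
console.log(withoutGroup); // [ 'hello ', ' world' ]

// Capture group: each matched delimiter is inserted into the result.
const withGroup = sample.split(/(\[[^\]]*\])/);
console.log(withGroup); // [ 'hello ', '[a14]', ' world' ]
```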

UPDATE based on comment:

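Whitespace around the bracketed tokens can be trimmed by changing the above regex to / *(\[[^\]]*\]) */. The spaces sit outside the capture group, so they are consumed by the split and dropped from the result.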
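
A short sketch of the trimmed variant, run on the string from the question (the variable names are illustrative):

```javascript
const input = "hello [a14] world, and [b74] some more [h85], ...";

// The optional spaces are part of the delimiter but outside the capture
// group, so only the bracketed token itself is kept in the output.
const trimmed = input.split(/ *(\[[^\]]*\]) */);
console.log(trimmed);
// [ 'hello', '[a14]', 'world, and', '[b74]', 'some more', '[h85]', ', ...' ]
```

Note that if the text starts or ends with a bracketed token, .split() returns an empty string at that end; chaining .filter(Boolean) is one way to drop it.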

Upvotes: 1

Tim Biegeleisen

Reputation: 521008

You can also try splitting on lookarounds:

var text = "hello [a14] world, and [b74] some more [h85], ...";
var arr = text.split(/(?=\[)|(?<=\])/);
console.log(arr);

The logic here is to split the string wherever what follows is [ or what precedes is ]. Note that lookarounds are zero-width, so the pattern (?=\[)|(?<=\]) matches a position but does not consume any text from the input while splitting.
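
One practical difference from a capture-group split shows up when the text begins or ends with a bracketed token. The sketch below uses a made-up sample string to compare the two; it assumes an engine with lookbehind support (ES2018 or later).

```javascript
const leading = "[a14] world [b74]";

// Zero-width split points: no empty strings at the edges.
const viaLookarounds = leading.split(/(?=\[)|(?<=\])/);
console.log(viaLookarounds); // [ '[a14]', ' world ', '[b74]' ]

// Consuming split with a capture group: empty strings appear at both ends.
const viaCapture = leading.split(/(\[[^\]]*\])/);
console.log(viaCapture); // [ '', '[a14]', ' world ', '[b74]', '' ]
```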

Upvotes: 4

ggorlen

Reputation: 56885

You can try /(\[[^\]\[]*\]|[^\]\[]+)/g. The differences are:

  • used + instead of * to grab at least one character and avoid matching empty strings
  • [^\]\[] gives you "any character not brackets" so you can avoid non-greedy matching with ?
  • shifted capturing groups around to be more intuitive

const txt = "hello [a14] world, and [b74] some more [h85], ...";
console.log(txt.match(/(\[[^\]\[]*\]|[^\]\[]+)/g));
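
To illustrate the first point, here is a small sketch on a made-up string comparing the * and + quantifiers in the second alternative. With *, the pattern can also match the empty string at the end of the input:

```javascript
const text = "ab[c]";

// With *, the second alternative can match zero characters,
// which yields an empty match at the end of the input.
const withStar = text.match(/(\[[^\]\[]*\]|[^\]\[]*)/g);
console.log(withStar); // [ 'ab', '[c]', '' ]

// With +, every match must contain at least one character.
const withPlus = text.match(/(\[[^\]\[]*\]|[^\]\[]+)/g);
console.log(withPlus); // [ 'ab', '[c]' ]
```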

Upvotes: 3
