Vinod

Reputation: 93

Parsing to an array based on multiple delimiters

I need to parse the following string (I am parsing a PDF and would like to avoid third-party packages).

/Type /Pages /MediaBox [0 0 612 792] /Count 9 /Kids [ 5 0 R 355 0 R ]

I am using JavaScript:

String.split(' ');

The output I would like to get is [ '/Type', '/Pages', '/MediaBox', '[0 0 612 792]', '/Count', '9', '/Kids', '[ 5 0 R 355 0 R]' ]

Instead, splitting on spaces results in the following output: [ '<<', '/Type', '/Pages', '/MediaBox', '[0', '0', '612', '792]',

Specifically, I would like to treat '[' and ']' as delimiters, so that the bracketed part would read '[ 5, 0, R, 355, 0, R]'.

The Final result expected is this:

I am trying to see whether I can solve this with a regular expression, and I am currently stuck.

Upvotes: 0

Views: 40

Answers (2)

alebianco

Reputation: 2555

This regex should take care of it:

var input = "/Type /Pages /MediaBox [0 0 612 792] /Count 9 /Kids [ 5 0 R 355 0 R ]"
var result = input.match(/(\[[^\]]+\]|\S+)/g)
console.log(result)

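As an explanation: the regex matches either a [ followed by one or more characters that are not ] and then the closing ], which is the (\[[^\]]+\]) part, OR a sequence of characters that are not whitespace, which is the (\S+) part. The bracket alternative is listed first, so it is tried before \S+ at each position.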
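To make the behaviour of that pattern easier to check, here is a minimal sketch that wraps it in a function. The name tokenize and the second test string are assumptions for illustration and are not part of the answer. It also shows one limitation worth knowing about: because [^\]]+ stops at the first ], a nested array is cut short.

```javascript
// Hypothetical wrapper around the regex from the answer above.
function tokenize(input) {
  // match() returns null when nothing matches, so fall back to [].
  return input.match(/\[[^\]]+\]|\S+/g) || [];
}

const flat = tokenize("/Count 9 /Kids [ 5 0 R 355 0 R ]");
console.log(flat); // [ '/Count', '9', '/Kids', '[ 5 0 R 355 0 R ]' ]

// Limitation: a nested array is cut at the first "]".
const nested = tokenize("/A [ [1 2] 3 ]");
console.log(nested); // [ '/A', '[ [1 2]', '3', ']' ]
```

If the input can contain nested arrays, a regex alone is probably not enough and a small hand-written scanner that tracks bracket depth would be the safer choice.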

Upvotes: 2

Rajesh

Reputation: 24955

You can use a regex that matches the [...] groups and replace the spaces inside each group with commas. After that, the groups no longer contain spaces, so you just have to split the string on spaces.

var s = "/Type /Pages /MediaBox [0 0 612 792] /Count 9 /Kids [ 5 0 R 355 0 R ]";

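var arr_reg = /\[(.*?)(?:\]|$)/g;
// Use the captured group (the text between the brackets) so that an
// unclosed "[" at the end of the string does not lose its last character.
s = s.replace(arr_reg, function(match, inner){
  return "[" + inner.trim().replace(/\s+/g, ',') + "]";
});
console.log(s.split(' '));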
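If real arrays are more convenient than comma-joined strings, the same idea can be taken one step further. The following is only a sketch under that assumption: parseEntries is a hypothetical helper, not something from the answer, and it does not handle nested arrays.

```javascript
// Hypothetical helper: bracketed groups become arrays, everything else
// stays a plain string token.
function parseEntries(s) {
  // A whole [ ... ] group, or a run of non-whitespace characters.
  const tokens = s.match(/\[[^\]]*\]|\S+/g) || [];
  return tokens.map(function (token) {
    // Leave anything that is not a complete [ ... ] group untouched.
    if (token[0] !== "[" || token[token.length - 1] !== "]") return token;
    const inner = token.slice(1, -1).trim();
    return inner === "" ? [] : inner.split(/\s+/);
  });
}

const entries = parseEntries(
  "/Type /Pages /MediaBox [0 0 612 792] /Count 9 /Kids [ 5 0 R 355 0 R ]"
);
console.log(entries);
```

Here entries[3] is the array of the four /MediaBox numbers (still as strings) and entries[7] holds the six /Kids tokens, so no second split is needed later.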

Upvotes: 1
