Birrel

Reputation: 4834

JavaScript - Splitting string by multiple regex patterns returns undesired values

The Setup

I have a string, something like:

Sample Strings

val4[3].sub1[ 1 ][2].smth.esl

// or

val4.sub1

// or

val4.sub1[2].smth

// and so on...

The string will always abide by the following rules:

Think of the string in the same way you'd access items in an associative array. For example, the first string listed above might access an array like:

Sample Array Structure

val4 =  [
            0, 
            'some string', 
            2, 
            {
                sub1:   [
                            [1, 2, 3], 
                            [
                                'val',
                                1, 
                                {
                                    smth:   {
                                                esl: 'final value'
                                            }
                                }
                            ], 
                            [4, 5, 6]
                        ], 
                sub2: 'another str'
            }, 
            4
        ];

and the value at val4[3].sub1[ 1 ][2].smth.esl is final value.

The array is admittedly a nightmare, but it should help illustrate the acceptable patterns in the input strings.

Desired Behavior

What I am trying to do is split the string by the following:

So, using the first string as an example again:

val4[3].sub1[ 1 ][2].smth.esl

would split into

Desired Output

val4
[3]
.sub1
[ 1 ]
[2]
.smth
.esl

You can see at this RegExr what the desired groupings should be.

The Trials

JSFiddle of code

When I run the following:

Code Block #1

var str = 'val4[3].sub1[ 1 ][2].smth.esl';

var re = /(\[\s?[\d]+\s?\])|(?=\.([A-Za-z0-9_-]+))/g;   

var splits = str.split(re);

Note: In the regex, I've made the second pattern a positive lookahead (?=) so that the periods stay with the characters that follow.

I end up with the following:

Output #1

splits = [
            "val4",
            "[3]",
            null,
            ".sub1",
            "[ 1 ]",
            null,
            "",
            "[2]",
            null,
            ".smth",
            null,
            "esl",
            ".esl"
         ];

The result is pretty close, with a few notable exceptions:

Running this with only the brackets part

Code Block #2

var re = /(\[\s?[\d]+\s?\])/g;

returns:

Output #2

splits = [
            "val4",
            "[3]",
            ".sub1",
            "[ 1 ]",
            "",
            "[2]",
            ".smth.esl"
         ];

Running this with only the period/chars part

Code Block #3

var re = /(?=\.([A-Za-z0-9_-]+))/g;

returns:

Output #3

splits = [
            "val4[3]",
            ".sub1[ 1 ][2]",
            "smth",
            ".smth",
            "esl",
            ".esl"
         ];

But both of the preceding have their own flaws (empty values, strings without their accompanying periods).
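For reference, the stray entries appear to follow from how split treats its separator regex: capture groups are spliced into the result, a group that takes no part in a match contributes undefined (shown as null above), and adjacent separators leave an empty string between them. A minimal sketch on toy strings, with made-up variable names:

```javascript
// A capture group in the separator is spliced into the result.
var withGroup = 'a1b'.split(/(\d)/);          // ["a", "1", "b"]

// A group that takes no part in the match contributes undefined.
var unusedGroup = 'a1b'.split(/(\d)|(x)/);    // ["a", "1", undefined, "b"]

// Adjacent separators leave an empty string between them.
var adjacent = 'a[1][2]b'.split(/(\[\d\])/);  // ["a", "[1]", "", "[2]", "b"]

// A lookahead matches at zero width, but its inner group is still captured.
var lookahead = 'a.b'.split(/(?=\.(b))/);     // ["a", "b", ".b"]

console.log(withGroup, unusedGroup, adjacent, lookahead);
```

The last case mirrors Output #3, where each name shows up once without its period (the captured group) and once with it (the actual piece).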

I've said "I'm so close" with regex stuff before, and have been miles off. But, I feel like I'm right on the cusp.

So, how can I modify the regex/split to accomplish what I'm after?

Basically, no null or empty values returned, and only strings with their preceding periods.
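One possible way to get exactly that, offered as a sketch rather than a definitive fix: collect the tokens themselves with match and a global regex instead of splitting on them. With the g flag, match returns only the full matches, so capture groups and zero-width matches cannot add undefined or empty entries. The name tokenize is hypothetical, and the pattern uses \s* where the question used \s?, which is an assumption about how much whitespace is allowed inside the brackets.

```javascript
// Hypothetical helper: returns the path tokens, with each property name
// keeping its leading period.
function tokenize(str) {
    // Either a bracketed index such as "[3]" or "[ 1 ]",
    // or a property name with an optional leading period.
    var re = /\[\s*\d+\s*\]|\.?[A-Za-z0-9_-]+/g;
    return str.match(re) || []; // match() returns null when nothing matches
}

var tokens = tokenize('val4[3].sub1[ 1 ][2].smth.esl');
console.log(tokens);
// ["val4", "[3]", ".sub1", "[ 1 ]", "[2]", ".smth", ".esl"]
```

This uses nothing newer than ES3, and the same pattern should carry over to PHP's preg_match_all with little change.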

Hopefully I've been clear; please let me know if anything is ambiguous.

Important!

For proposed solutions, it needs to be...

  1. Browser-Independent

    This means likely nothing from ES6, since browser support for the new content is very inconsistent. People cannot be forced to adopt a browser just to run this.

  2. Easily ported over to PHP (Preferred, but not absolutely critical)

    This has a sister script written for server-side operations. Keeping away from JavaScript-specific features will help implement it in both languages.

Additional (Not Essential)

(For those interested in the context and purpose)

Spoiler alert! There really is an array. Not identical to val4 defined above, but not dissimilar, and never the same twice.

val4 =  [... and so on...];

A string is provided (must be a string type)

str = 'val4[3].sub1[ 1 ][2].smth.esl';

And the return must be the value in val4, at the address in str.

var val4 =  [... and so on...];

var str = 'val4[3].sub1[ 1 ][2].smth.esl';

var result = getItem(val4, str);

Would literally return

return val4[3].sub1[1][2].smth.esl;

If you've got a bangin' way to do what I'm trying to achieve without all the regex, I'm happy to hear it. But I'm still interested in the primary issue laid out in the question.

Upvotes: 0

Views: 247

Answers (3)

Redu

Reputation: 26161

OK, for this job you need to be able to set object properties dynamically. I had previously written some code for that purpose: Object.prototype.setNestedValue().

setNestedValue([prop1[, prop2[, prop3...]]], value) works exactly like its twin getNestedValue(), except that the last argument is the value to set. If a property does not exist, it creates an object or an array depending on the type of the next argument: a string argument results in an object, while a number argument results in an array of that size.

Let's see the code.

Object.prototype.setNestedValue = function(...a) {
  a.length > 2 ? typeof this[a[0]] === "object" && this[a[0]] !== null ? this[a[0]].setNestedValue(...a.slice(1))
                                                                       : (this[a[0]] = typeof a[1] === "string" ? {} : new Array(a[1]),
                                                                         this[a[0]].setNestedValue(...a.slice(1)))
               : this[a[0]] = a[1];
  return this;
};

var str = "val4[3].sub1[ 1 ][2].smth.esl",
    arr = str.split(/\[\s*(\d+)\s*\]|\./)
             .filter(prop => prop)
             .map(prop => Number.isNaN(+prop) ? prop : +prop)
             .concat("final value");
// declared with var so that result is not an implicit global
var result = {}.setNestedValue(...arr);
console.log(JSON.stringify(arr));
console.log(JSON.stringify(result,null,4));

The snippet first logs the arguments array used to invoke setNestedValue(...arr), and then the resulting object.
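Since the question asks for something that runs without ES6, here is a minimal sketch of the same idea as a plain ES5 function, without rest/spread syntax and without extending Object.prototype. The name setNested is hypothetical, and it creates empty arrays ([]) for numeric keys rather than pre-sized ones, which is a simplification of the code above. It assumes keys is a non-empty array.

```javascript
// Hypothetical ES5 helper: sets `value` at the path described by `keys`,
// creating intermediate containers where they are missing.
function setNested(obj, keys, value) {
    var current = obj;
    for (var i = 0; i < keys.length - 1; i++) {
        var key = keys[i];
        var next = current[key];
        if (typeof next !== 'object' || next === null) {
            // A numeric next key suggests an array; anything else an object.
            next = current[key] = (typeof keys[i + 1] === 'number') ? [] : {};
        }
        current = next;
    }
    current[keys[keys.length - 1]] = value;
    return obj;
}

var built = setNested({}, ['val4', 3, 'sub1', 1, 2, 'smth', 'esl'], 'final value');
console.log(built.val4[3].sub1[1][2].smth.esl); // "final value"
console.log(Array.isArray(built.val4));         // true
```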

Oh, my bad! I thought you wanted to construct the object dynamically, but you actually want to get the nested value dynamically from an already constructed object. Object.prototype.getNestedValue() is ideal for this job. We already have our object constructed anyway, so let's reuse the code above and get the nested value dynamically.

Object.prototype.getNestedValue = function(...a) {
  return a.length > 1 ? (this[a[0]] !== void 0 && this[a[0]].getNestedValue(...a.slice(1))) : this[a[0]];
};

Object.prototype.setNestedValue = function(...a) {
  a.length > 2 ? typeof this[a[0]] === "object" && this[a[0]] !== null ? this[a[0]].setNestedValue(...a.slice(1))
                                                                       : (this[a[0]] = typeof a[1] === "string" ? {} : new Array(a[1]),
                                                                         this[a[0]].setNestedValue(...a.slice(1)))
               : this[a[0]] = a[1];
  return this;
};

var str = "val4[3].sub1[ 1 ][2].smth.esl",
    arr = str.split(/\[\s*(\d+)\s*\]|\./)
             .filter(prop => prop)
             .map(prop => Number.isNaN(+prop) ? prop : +prop)
             .concat("final value"),
  myObj = {}.setNestedValue(...arr),
  value;

arr.pop(); // remove "final value" from the path, since we are now going to read it
value = myObj.getNestedValue(...arr);
console.log(value);

Upvotes: 0

Thomas

Reputation: 12637

I assume this is what you ultimately want to build.

I slightly extended the regex to also parse things like ["lorem ipsum"].

	//parses the path into an Array of keys
	var parsePath = (function(){
		var fetchParts = /\[\s*(\d+|"(?:\\[\s\S]|[^\\"])+")\s*\]|(?:\.|^)([A-Za-z_$][A-Za-z_$0-9]*)/g;
		var isValid = new RegExp("^(?:" + fetchParts.source.replace(/\((?:\?\:)?/g, "(?:") + ")+$", "")
		
		return function(str){
			var s = str == null? "": String(str).trim();
			if(!isValid.test(s))
				throw new Error("invalid path: "+JSON.stringify(str));

			for(var m, keys = []; m = fetchParts.exec(s);)
				keys.push(m[1]? JSON.parse(m[1]): m[2]);
			return keys;
		}
	})();

	//takes a path or an Array of keys and returns a function that resolves the path from the passed object.
	//returns undefined if path can't be resolved;
	
	//v1
	var resolvePath = function(path){
		var keys = Array.isArray(path)? path: parsePath(path);
		return keys.reduceRight(
			(nextFn, key) => obj => obj != null && key in obj? nextFn(obj[key]): void 0,
			v => v
		);
	}

	//v2
	var resolvePath = function(path){
		var keys = Array.isArray(path)? path: parsePath(path);
		return function(obj){
			for(var i=0; i<keys.length; ++i){
				if(obj == null) return void 0;
				obj = obj[keys[i]]
			}
			return obj;
		}
	}


	//usage:
	var getSmthEsl = resolvePath('val4[3].sub1[ 1 ][2]["lorem\\nipsum"].smth.esl');

	console.log("a", getSmthEsl({
		val4: [
			null,    //0
			null,    //1
			null,    //2
			{            //3
				sub1: [
					null,    //0
					[            //1
						null,    //0
						null,    //1
						{            //2
							"lorem\nipsum": {
								smth: {
									esl: {
										success: true
									}
								}
							}
						}
					]
				]
			}
		]
	}))
	
	console.log("b", getSmthEsl({ val4: [] }))

Upvotes: 1

georg

Reputation: 214959

Much simpler than you think:

window.val4 =  [
            0, 
            'some string', 
            2, 
            {
                sub1:   [
                            [1, 2, 3], 
                            [
                                'val',
                                1, 
                                {
                                    smth:   {
                                                esl: 'final value'
                                            }
                                }
                            ], 
                            [4, 5, 6]
                        ], 
                sub2: 'another str'
            }, 
            4
        ];


let ref = 'val4[3].sub1[ 1 ][2].smth.esl'


let result = window; // or 'global' in node

ref.replace(/(\w+)|\[\s*(\d+)\s*\]/g, (_, $1, $2) => result = result[$1 || $2]);

console.log(result)
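The same replace-style walk can be sketched in ES5 with an explicit root instead of window, which is closer to the getItem(val4, str) call in the question. This is only an illustration: getItem here is a hypothetical implementation, and it assumes the leading name in the string (e.g. "val4") refers to the root itself and can be skipped.

```javascript
// Hypothetical ES5 helper: resolves `ref` against `root`.
// Assumes the first token of `ref` is the root's own name.
function getItem(root, ref) {
    var re = /\.?([A-Za-z0-9_-]+)|\[\s*(\d+)\s*\]/g;
    var current = root;
    var first = true;
    var m;
    while ((m = re.exec(ref)) !== null) {
        if (first) {            // skip the root's own name
            first = false;
            continue;
        }
        if (current === null || current === undefined) {
            return undefined;   // path cannot be resolved any further
        }
        current = current[m[1] || m[2]];
    }
    return current;
}

var val4 = [0, 'some string', 2, {
    sub1: [[1, 2, 3], ['val', 1, { smth: { esl: 'final value' } }], [4, 5, 6]],
    sub2: 'another str'
}, 4];

console.log(getItem(val4, 'val4[3].sub1[ 1 ][2].smth.esl')); // "final value"
console.log(getItem(val4, 'val4[3].sub2'));                  // "another str"
```

The loop uses only a regex and exec, so it should translate to PHP with preg_match_all and a foreach without much trouble.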

Upvotes: 1
