Hewhoshallnotbenamed

Reputation: 35

Need a regex that finds "string" but not "[string]"

I'm trying to build a regular expression that parses a string and skips things in brackets.

Something like

string = "A bc defg hi [hi] jkl mnop.";

The .match() should return "hi" but not "[hi]". I've spent 5 hours running through regexes, but I'm throwing in the towel.

Also, this is for JavaScript or jQuery, if that matters.

Any help is appreciated. Also I'm working on getting my questions formatted correctly : )

EDIT:

OK, I just had a eureka moment and figured out that the original RegExp I was using actually did work. But when I was replacing the matches with the [matches], it simply replaced the first match in the string... over and over. I thought this was my regex refusing to skip the brackets, but after a lot of time trying almost all of the solutions below, I realized that I was derping hardcore.

When .replace was working its magic it was on the first match, so I quite simply added a space to the end of the result word as follows:

var result = string.match(regex);
var modifiedResult = '[' + result[0].toString() + ']';
string = string.replace(result[0].toString() + ' ', modifiedResult + ' '); // replace() returns a new string, so assign it back

This got it to stop targeting the original word in the string and stop adding a new set of brackets to it with every match. Thank you all for your help. I am going to give answer credit to the post that prodded me in the right direction.
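The behavior described in this edit can be reproduced in a few lines. The sketch below is only an illustration, not the asker's actual code: the helper name `bracketWord` and the sample text are made up, and it assumes the goal is to wrap every unbracketed occurrence of one plain word in brackets in a single pass.

```javascript
// With a string pattern, replace() only touches the first occurrence.
const text = "A bc defg hi [hi] jkl hi mnop.";
const firstOnly = text.replace("hi", "[hi]");

// Hypothetical helper: wrap every standalone `word` that is not already bracketed.
function bracketWord(input, word) {
  // Escape regex metacharacters so the word is matched literally.
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Alternative 1 consumes an existing [..] group; alternative 2 captures the bare word.
  const pattern = new RegExp("\\[[^\\]]*\\]|\\b(" + escaped + ")\\b", "g");
  return input.replace(pattern, (match, bare) =>
    bare === undefined ? match : "[" + bare + "]"
  );
}

const wrapped = bracketWord(text, "hi");
console.log(firstOnly); // A bc defg [hi] [hi] jkl hi mnop.
console.log(wrapped);   // A bc defg [hi] [hi] jkl [hi] mnop.
```

Because already bracketed groups are consumed by the first alternative, running `bracketWord` again on its own output should not add a second pair of brackets.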

Upvotes: 2

Views: 235

Answers (7)

Satyajit

Reputation: 3859

Instead of skipping the match you can probably try something different - match everything but do not capture the string within square brackets (inclusive) with something like this:

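var s = "A [bc] [defg] hi [hi] j[kl]u m[no]p."; // sample input
var r = /(?:\[.*?[^\[\]]\])|(.)/g;
var result;
var str = [];
while ((result = r.exec(s)) !== null) {
  // result[1] is set only when a single character outside [ ] was captured;
  // it is undefined when a whole [string] group was matched
  if (result[1] !== undefined) {
    str.push(result[1]);
  }
}
console.log(str.join(''));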

The last line prints the parts of the string that are not inside a [string] group. For example, with the input "A [bc] [defg] hi [hi] j[kl]u m[no]p." the code prints "A   hi  ju mp.", with the whitespace around the removed groups left intact. You can adapt this code for other tasks, such as replacing.

Upvotes: 0

Francisco Meza

Reputation: 883

Preprocess the target string by removing everything between brackets before trying to match your RE:

string = "A bc defg hi [hi] jkl mnop."
tmpstring = string.replace(/\[.*\]/, "")

Then apply your RE to tmpstring.

Correction: made the bracket match non-greedy (lazy) per nhahtd's comment below, and also made the RE global:

string = "A bc defg hi [hi] jkl mnop."
tmpstring = string.replace(/\[.*?\]/g, "")
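The difference between the two versions only shows up when the string contains more than one bracketed group. Here is a small sketch of that, followed by the "then apply your RE" step; the sample input and the `\bhi\b` pattern are assumptions chosen for illustration.

```javascript
const input = "A [bc] defg hi [hi] jkl mnop.";

// Greedy: .* runs from the first "[" to the last "]" and removes everything in between.
const greedy = input.replace(/\[.*\]/, "");

// Lazy + global: each bracketed group is removed on its own.
const lazy = input.replace(/\[.*?\]/g, "");

// Search for the word in the cleaned copy.
const found = lazy.match(/\bhi\b/g);

console.log(JSON.stringify(greedy)); // "A  jkl mnop."
console.log(JSON.stringify(lazy));   // "A  defg hi  jkl mnop."
console.log(found);                  // [ 'hi' ]
```

One trade-off to keep in mind: match positions found in the cleaned copy no longer line up with positions in the original string, so this approach fits best when only the matched text matters.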

Upvotes: 3

Gaël Barbin

Reputation: 3919

This builds an array of all the strings contained in [ ]:

var regex = /\[([^\]]*)\]/g; // the g flag is required: without it exec() never advances and the loop never ends
var string = "A bc defg hi [hi] [jkl] mnop.";
var results = [], result;
while ((result = regex.exec(string)) !== null)
    results.push(result[1]);

edit

To answer the question, this replace returns the string without anything that is inside [ ], and drops the extra whitespace left behind:

"A bc defg [hi] mnop [jkl].".replace(/(\s{0,1})\[[^\]]*\](\s{0,1})/g,'$1')

Upvotes: 0

Zhais

Reputation: 1541

Using only Regular Expressions, you can use:

hi(?!])

as an example.

Read about negative lookahead here: http://www.regular-expressions.info/lookaround.html Unfortunately, JavaScript did not support lookbehind when this was written (it was added later, in ES2018).

I tested on http://regexpal.com/ with abcd[hi]jkhilmnop as test data and hi(?!]) as the regex. It matched 'hi' without matching '[hi]'. Basically, it matched 'hi' as long as it was not followed by a ']' character.

This can, of course, be expanded if needed. It has the benefit of not requiring any preprocessing of the string.
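To show what "expanded" might look like, here is a hedged sketch that compares the lookahead-only pattern with a variant that also checks the preceding character. It assumes an engine with ES2018 lookbehind and `String.prototype.matchAll` (for example a current Node.js or browser); the sample string is the test data above with a made-up "[hi there]" appended.

```javascript
const sample = "abcd[hi]jkhilmnop [hi there]";

// Lookahead only: skips "hi" when the very next character is "]".
const aheadOnly = Array.from(sample.matchAll(/hi(?!\])/g), (m) => m.index);

// Lookbehind + lookahead: also skips "hi" when the previous character is "[".
const bothSides = Array.from(sample.matchAll(/(?<!\[)hi(?!\])/g), (m) => m.index);

console.log(aheadOnly); // [ 10, 19 ]
console.log(bothSides); // [ 10 ]
```

The lookahead-only version still matches the "hi" in "[hi there]" because only the next character is inspected. Even the two-sided version would match a word in the middle of a longer bracketed phrase such as "[oh hi there]", so lookarounds alone suit inputs where each bracket holds a single word.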

Upvotes: 1

bart

Reputation: 7777

What do you want to do with it? If you want to selectively replace parts like "hi" except when it's "[hi]", then I often use a system where I match what I want to avoid first and then what I want to match; if it matches what I want to avoid, I return the match unchanged, otherwise I return the processed match.

Like this:

return string.replace(/(\[\w+\])|(\w+)/g, function(all, m1, m2) {return m1 || m2.toUpperCase()});

which, with the given string, returns:

"A BC DEFG HI [hi] JKL MNOP."

Thus: it replaces every word with uppercase (m1 is empty), except if the word is between square brackets (m1 is not empty).
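Since the one-liner starts with a bare `return`, it has to live inside a function to run. A minimal runnable sketch follows; the function name `upperExceptBracketed` is made up for illustration, and the regex is the one from this answer.

```javascript
// Hypothetical wrapper around the replace call shown above.
function upperExceptBracketed(text) {
  return text.replace(/(\[\w+\])|(\w+)/g, function (all, m1, m2) {
    // m1 is set for a bracketed word (returned untouched); otherwise m2 holds a plain word.
    return m1 || m2.toUpperCase();
  });
}

const out = upperExceptBracketed("A bc defg hi [hi] jkl mnop.");
console.log(out); // A BC DEFG HI [hi] JKL MNOP.
```

Note that `\[\w+\]` only protects brackets that hold a single word: for "[hi there]" the first alternative fails and both words get uppercased. If multi-word brackets can occur, swapping the first alternative for something like `\[[^\]]*\]` is one option.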

Upvotes: 0

user1726343

Reputation:

You don't necessarily need regex for this. Simply use string manipulation:

var arr = string.split("[");
var final = arr[0] + arr[1].split("]")[1];

If there are multiple bracketed expressions, use a loop:

while (string.indexOf("[") != -1){
    var arr = string.split("[");
    string = arr[0] + arr.slice(1).join("[").split("]").slice(1).join("]");
}
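As a variant of the same no-regex idea, the scan can be done in one pass with `indexOf` and `slice` instead of repeated `split` and `join`. This is a sketch rather than the code from this answer: the function name `stripBracketed` is made up, and unlike the loop above it leaves an unmatched "[" in place.

```javascript
// Hypothetical helper: remove every [..] group using only string methods.
function stripBracketed(text) {
  let result = "";
  let pos = 0; // start of the part not yet copied
  while (true) {
    const open = text.indexOf("[", pos);
    if (open === -1) break;
    const close = text.indexOf("]", open + 1);
    if (close === -1) break; // unmatched "[": keep the rest as it is
    result += text.slice(pos, open); // copy the text before the bracket
    pos = close + 1;                 // skip past the closing bracket
  }
  return result + text.slice(pos);
}

console.log(JSON.stringify(stripBracketed("A bc [x] defg hi [hi] jkl")));
// "A bc  defg hi  jkl"
```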

Upvotes: 1

whatyouhide

Reputation: 16781

r"\[(.*)\]"

Just play around with this if you want to use regular expressions.

Upvotes: 0
