warkentien2

Reputation: 979

RegEx leaves unwanted space behind

I have a list of words separated by spaces, e.g. the result of list.join(' ').
How do I remove a word (held in a variable) using RegEx without leaving a space behind?

view example code:

var testClasses = document.getElementsByTagName("div")[0].className;
var classToRemove = "test3";

document.getElementsByTagName('p')[0].innerHTML = "Removing class ." + classToRemove + " from: <strong>" + testClasses + "</strong>";

var re = new RegExp(classToRemove + "\s?", "g");
testClasses = testClasses.replace(re, "");

// I ran into the same problem trying to be more specific
// var re = new RegExp("(\S+\s?)*(" + classToRemove + "\s?)(\S+\s?)*", "g");
// testClasses = testClasses.replace(re, "$1$3");


document.getElementsByTagName('p')[1].innerHTML = "becomes: <strong>" + testClasses + "</strong>" + " // which looks great on the DOM.";
console.log(testClasses);
console.log(testClasses.split(' '));
<div class="test1 test2 test3 test4 test5"></div>
<p></p>
<p></p>
<p>However, if you check console, the space is there. <br><strong>How do I remove this extra space?</strong> Without having to run a second replace.</p>

Restrictions:

Upvotes: 1

Views: 146

Answers (2)

rodneyrehm

Reputation: 13557

May I interest you in Element.classList? This API lets you mutate the class attribute through convenient methods like .add(), .remove() and .toggle(). For class names this is far superior to rolling your own RegExp solution.
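As a minimal sketch of the classList route, assuming the element from the question (the first div on the page): removeClass is a hypothetical helper name used only for illustration, not part of the DOM API.

```javascript
// Hypothetical helper: remove one class through the DOM classList API.
function removeClass(element, className) {
  // classList.remove() does nothing if the class is not present
  element.classList.remove(className);
  // return the resulting class string so the caller can inspect it
  return element.className;
}

// In a browser, with the markup from the question:
// removeClass(document.getElementsByTagName('div')[0], 'test3');
```

The call is left as a comment because it needs a browser document to run against.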


If it doesn't have to be a RegExp solution, you could try Array.filter:

'alpha bravo charlie'
  .split(' ')
  .filter(function(token) { return token !== 'alpha' })
  .join(' ');
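The same filter idea can be wrapped in a small function. This is only a sketch: removeTokenByFilter is a made-up name, and splitting on /\s+/ after trim() is an assumption added here so that repeated or surrounding spaces in the input do not survive.

```javascript
// Hypothetical helper: drop every occurrence of `token` from a
// whitespace-separated list and rejoin with single spaces.
function removeTokenByFilter(text, token) {
  return text
    .trim()          // ignore leading/trailing whitespace
    .split(/\s+/)    // split on runs of whitespace, not just one space
    .filter(function (t) { return t !== '' && t !== token; })
    .join(' ');
}

console.log(removeTokenByFilter('test1 test2 test3 test4 test5', 'test3'));
// test1 test2 test4 test5
```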

But let's get on with solving your RegExp riddle. In a string "alpha bravo charlie" you want to be able to remove any of the three tokens without leaving behind any unnecessary spaces before, after or between the remaining tokens. This can be done with the help of a negative look-ahead assertion (x(?!y)):

function removeToken(text, token) {
  var pattern = new RegExp('(\\s+(?!\\S+\\s+))?' + token + '\\s*');
  return text.replace(pattern, '');
}

The optional group (\s+(?!\S+\s+))? consumes the whitespace in front of your token only if no whitespace follows the token. This way you avoid removing both spaces when the token sits in the middle. The group reads "match one or more whitespace characters, unless they are followed by one or more non-whitespace characters that are in turn followed by one or more whitespace characters". The "non-whitespace characters" stand in for your token, so the token does not have to be injected into the look-ahead as well. Because the leading whitespace is not always present, the capture group is made optional by the trailing ?.
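To see which side's whitespace the pattern actually consumes, it can help to look at the matched substring instead of the replaced result. The sketch below reuses the pattern from removeToken; buildPattern is just an illustrative name.

```javascript
// Build the same pattern that removeToken uses for a given token.
function buildPattern(token) {
  return new RegExp('(\\s+(?!\\S+\\s+))?' + token + '\\s*');
}

var text = 'alpha bravo charlie';

// First token: nothing in front, so only the trailing space is consumed.
var first = buildPattern('alpha').exec(text);
console.log(JSON.stringify(first[0]));  // "alpha "

// Middle token: the look-ahead rejects the leading space because
// whitespace follows the token, so only the trailing space is consumed.
var middle = buildPattern('bravo').exec(text);
console.log(JSON.stringify(middle[0])); // "bravo "

// Last token: no whitespace follows, so the leading space is consumed.
var last = buildPattern('charlie').exec(text);
console.log(JSON.stringify(last[0]));   // " charlie"
```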

To test this code, we can run all four cases:

var text = 'alpha bravo charlie';
var tests = {
  // <token to remove>: <resulting string>
  'alpha': 'bravo charlie',
  'bravo': 'alpha charlie',
  'charlie': 'alpha bravo',
  'delta': 'alpha bravo charlie',
};

Object.keys(tests).forEach(function(token) {
  var expected = tests[token];
  var result = removeToken(text, token);
  console.log('removed "' + token + '" got "' + result + '" which is', expected === result ? 'correct' : 'WRONG');
});

and that should print

removed "alpha" got "bravo charlie" which is correct
removed "bravo" got "alpha charlie" which is correct
removed "charlie" got "alpha bravo" which is correct
removed "delta" got "alpha bravo charlie" which is correct

If you expect your tokens to contain characters that have a meaning in RegExp, you'd want to escape them.
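A possible way to do that escaping is sketched below. escapeRegExp and removeTokenEscaped are names chosen here for illustration; the character class is the commonly used one for escaping RegExp metacharacters and is not taken from the question.

```javascript
// Hypothetical helper: backslash-escape characters that are special in a RegExp.
function escapeRegExp(string) {
  return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Same pattern as removeToken, but with the token escaped first.
function removeTokenEscaped(text, token) {
  var pattern = new RegExp('(\\s+(?!\\S+\\s+))?' + escapeRegExp(token) + '\\s*');
  return text.replace(pattern, '');
}

// Without escaping, the "." in "a.b" would match any character,
// so "axb" would be removed instead.
console.log(removeTokenEscaped('axb a.b c', 'a.b')); // axb c
```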

Upvotes: 2

Wiktor Stribiżew

Reputation: 626738

To match whitespace, the pattern needs a literal \s, that is, a backslash followed by s. With "\s?" in new RegExp(classToRemove + "\s?", "g") you actually defined an optional letter s, because in a JavaScript string literal you need two backslashes to produce one literal backslash.
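A quick way to check this is to inspect the source of the resulting RegExp. The sketch below uses the class string from the question; the variable names are illustrative.

```javascript
var classes = 'test1 test2 test3 test4 test5';

// Single backslash: the string literal "\s?" is just "s?".
var broken = new RegExp('test3' + "\s?", 'g');
console.log(broken.source);                              // test3s?
console.log(JSON.stringify(classes.replace(broken, ''))); // "test1 test2  test4 test5"

// Double backslash: the string contains \s?, so the regex sees whitespace.
var fixed = new RegExp('test3' + "\\s?", 'g');
console.log(fixed.source);                               // test3\s?
console.log(JSON.stringify(classes.replace(fixed, '')));  // "test1 test2 test4 test5"
```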

Use

var re = new RegExp("\\s*" + classToRemove, "g");

Note that "\\s*" (\s*) matches zero or more whitespace characters. Assuming classToRemove contains only word characters, it does not need regex escaping, so I am not adding that escaping code here.

If there can only be a single occurrence of a class name, remove the "g" global modifier and just use var re = new RegExp("\\s*" + classToRemove).
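Put together as a small function, this might look like the sketch below. removeClassName is a hypothetical name, and the trailing trim() is an addition made here: because the pattern only consumes whitespace in front of the class, removing the first class would otherwise leave a leading space. It also assumes no class name is a prefix of another one in the list.

```javascript
// Hypothetical helper around the single-occurrence pattern.
function removeClassName(classes, classToRemove) {
  var re = new RegExp('\\s*' + classToRemove);
  // trim() handles the case where the first class is removed
  return classes.replace(re, '').trim();
}

var testClasses = 'test1 test2 test3 test4 test5';
console.log(removeClassName(testClasses, 'test3')); // test1 test2 test4 test5
console.log(removeClassName(testClasses, 'test1')); // test2 test3 test4 test5
```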

Upvotes: 1
