Reputation: 197
I'm trying to remove (from a string) only the duplicates that occur sequentially. That is, given the string "1 2 3 3 2 1" only one of the 3's should be removed (i.e. "1 2 3 2 1"). I really thought I had it figured out. And then, during testing, I found a case where it didn't work. I've tried every combination I could think of, to no avail. Surely it's something simple, as it's not a hard match to do (except for me, obviously).
Here is some JavaScript to illustrate the problem. The first testVal string is handled correctly. The commented-out testVal string is not.
// The following string should reduce to: MTC MTCA MTC ORD MTC (it does).
var testVal = "MTC MTC MTCA MTC MTC MTC ORD MTC";
// The following string should reduce to: MTC (it does not. Result = MTC MTC).
// The string MTC MTC MTC MTC also only reduces to MTC MTC, so I'm thinking
// it's a whitespace issue.
// var testVal = "MTC MTC";
while (/\b(\s*\w+\s*)\b\1/.test(testVal)) {
    testVal = testVal.replace(/\b(\s*\w+\s*)\b\1/g,'$1');
}
alert(testVal);
Upvotes: 1
Views: 333
Reputation: 55392
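Your capture group includes the whitespace around the word, so the backreference \1 has to match that whitespace a second time. In "MTC MTC" the group can capture "MTC " (with the trailing space), but the second MTC has no trailing space, so nothing matches. Capture only the word, match the separating whitespace outside the group, and keep replacing with '$1'. Try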
/\b(\w+)\s+\1\b/
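For illustration, here is a minimal sketch of that idea applied to the strings from the question. The function name collapseAdjacentDuplicates is made up for this example, and the (?:...)+ wrapper around the repeat is an assumed variation on the pattern above: it lets a single replace() call swallow a whole run of repeats, so the while loop is not needed. The pattern exactly as written above should also work if you keep the loop.

```javascript
// Hypothetical helper; the name is not from the original posts.
function collapseAdjacentDuplicates(text) {
  // \b(\w+)        capture one whole word (no surrounding whitespace)
  // (?:\s+\1\b)+   one or more repeats of that same word, each preceded by
  //                whitespace; the trailing \b keeps "MTC" from matching
  //                the start of "MTCA"
  return text.replace(/\b(\w+)(?:\s+\1\b)+/g, '$1');
}

console.log(collapseAdjacentDuplicates("MTC MTC MTCA MTC MTC MTC ORD MTC")); // MTC MTCA MTC ORD MTC
console.log(collapseAdjacentDuplicates("MTC MTC"));                          // MTC
console.log(collapseAdjacentDuplicates("MTC MTC MTC MTC"));                  // MTC
console.log(collapseAdjacentDuplicates("1 2 3 3 2 1"));                      // 1 2 3 2 1
```

Note that \w only covers letters, digits and underscore, so words containing other characters (hyphens, accented letters) would need a different character class.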
Upvotes: 1