Regex to remove all characters that are repeated

I'm looking for a regex that removes every character that occurs more than once in a string. I have already solved this with a loop and am just wondering whether a regex can do the same.

This is what I have so far:

function onlyUnique(str) {
  var re = /(.)(?=.*\1)/g
  return str.replace(re, '');
}

This string:

"rc iauauc!gcusa_usdiscgaesracg"

should end up as this:

" !_de"

Upvotes: 5

Answers (5)

bengrin

Reputation: 26

function onlyUnique(str) {
  // match the characters you want to remove
  var match = str.match(/(.)(?=.*\1)/g);
  // if the string is already unique, return it unchanged
  if (!match) {
    return str;
  }
  // build a character class, escaping characters that are special inside [...]
  var pattern = '[' + match.join('').replace(/[\\\]^-]/g, '\\$&') + ']';
  // create a regex with the characters you want to remove
  var re = new RegExp(pattern, 'g');
  return str.replace(re, '');
}

Upvotes: 0

Wiktor Stribiżew

Reputation: 626845

If you want to do it with a regex, you can use your own regex with a callback function inside a replace.

var re = /(.)(?=.*\1)/g; 
var str = 'rc iauauc!gcusa_usdiscgaesracg';
var result = str;
str.replace(re, function(m, g1) {
    result = result.replace(RegExp(g1.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"), "g"), '');
});
document.getElementById("r").innerHTML = "'" + result + "'";

<div id="r"/>

The idea is: get the duplicated character, and remove it from the input string. Note that escaping is necessary if the character might be a special regex metacharacter (thus, g1.replace(/[.*+?^${}()|[\]\\]/g, "\\$&") is used).

Another idea comes from Washington Guedes's deleted answer. Here is my own implementation of it, which also removes duplicate symbols from the character class and escapes special regex characters:

var s = "rc iauauc!gcusa_u]sdiscgaesracg]";
var delimiters = '[' + (s.match(/(.)(?=.*\1)/g) || []).filter(function(value, index, self) { // find all repeating chars (none found gives an empty class)
    return self.indexOf(value) === index;  // get unique values only
}).join('').replace(/[.*+?^${}()|[\]\\-]/g, "\\$&") + ']'; // escape special chars, including "-" so it cannot form a range
var regex = new RegExp(delimiters, 'g'); // build the global regex from the delimiters
var result = s.replace(regex, '');  // obtain the result
document.getElementById("r2").innerHTML = "'" + result + "'";

<div id="r2"/>

NOTE: if you want to support newline symbols as well, replace . with [^] or [\s\S] inside the regex pattern.
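
To illustrate the note above, here is a small sketch (the sample string is made up for this example) comparing what the two patterns detect when the repeats sit on different lines:

```javascript
var s = "a\nb\na";  // "a" and "\n" each occur twice, on different lines

// "." stops at line breaks, so no repeat is detected.
var withDot = s.match(/(.)(?=.*\1)/g);
console.log(withDot);  // null

// [\s\S] matches any character, including "\n".
var withAny = s.match(/([\s\S])(?=[\s\S]*\1)/g);
console.log(JSON.stringify(withAny));  // ["a","\n"]
```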

Upvotes: 1

Oriol

Reputation: 288120

Your regex matches a character only when another copy of it follows, so it removes every occurrence except the last one. The last copy of each repeated character is therefore left in the string.
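
A quick way to see this, as a minimal sketch that runs the regex from the question on a short made-up string and on the question's input:

```javascript
var re = /(.)(?=.*\1)/g;

// Only the first "a" has another "a" after it, so only that one is removed.
var short = "abca".replace(re, '');
console.log(short);  // "bca"

// On the question's input, the last copy of every repeated character survives.
var full = "rc iauauc!gcusa_usdiscgaesracg".replace(re, '');
console.log(JSON.stringify(full));  // " !_udiesracg"
```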

To address this problem, you should remove all duplicates simultaneously, but I don't think you can do this with a single replace.

Instead, I would build a map which counts the occurrences of each character, and then iterate the string again, pushing the characters that appeared only once to a new string:

function onlyUnique(str) {
  var map = Object.create(null);
  for(var i=0; i<str.length; ++i)
    map[str[i]] = (map[str[i]] || 0) + 1;
  var chars = [];
  for(var i=0; i<str.length; ++i)
    if(map[str[i]] === 1)
      chars.push(str[i]);
  return chars.join('');
}

Unlike indexOf, lookups in the hash map take constant time on average, so a call on a string of n characters costs O(n) overall.
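
As a side note on the single-replace question: in engines that support lookbehind assertions and the s flag (both ES2018), one pattern can check both directions at once. The following is a minimal sketch under that assumption. It is a different technique from the counting map above, the name onlyUniqueRegex is made up for illustration, and each match may rescan the string, so it does not keep the linear cost.

```javascript
// Hypothetical helper; assumes ES2018 lookbehind and the s (dotAll) flag.
function onlyUniqueRegex(str) {
  // (.) captures one character. It is removed if the same character
  // occurs later (lookahead) or earlier (lookbehind) in the string.
  return str.replace(/(.)(?:(?=.*\1)|(?<=\1.*\1))/gs, '');
}

console.log(JSON.stringify(onlyUniqueRegex("rc iauauc!gcusa_usdiscgaesracg")));  // " !_de"
```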

Upvotes: 1

Tushar

Reputation: 87203

You can use Array#filter with Array#indexOf and Array#lastIndexOf to check if the element is repeated.

var str = "rc iauauc!gcusa_usdiscgaesracg";

// Split to get array
var arr = str.split('');

// Filter the split array
str = arr.filter(function (e) {
    // If index and lastIndex are equal, the element is not repeated
    return arr.indexOf(e) === arr.lastIndexOf(e);
}).join(''); // Join to get string from array

console.log(str);
document.write(str);

Upvotes: 4

Sudhir Bastakoti

Reputation: 100175

I don't know whether a regex can do that, but you can work it out with a for loop, like this:

function unikChars(str) {
    var store = []; // declared locally to avoid leaking a global
    for (var a = 0, len = str.length; a < len; a++) {
        var ch = str.charAt(a);
        if (str.indexOf(ch) == a && str.indexOf(ch, a + 1) == -1) {
            store.push(ch);
        }
    }
    return store.join("");
}

var str = 'rc iauauc!gcusa_usdiscgaesracg';
console.log(unikChars(str)); // gives " !_de"

Demo: jsFiddle

Upvotes: 1
