Reputation: 95

Why does this function fail when I use this regular expression?

I have been practicing getting more comfortable with the use of regular expressions, but I am having a really hard time understanding why this function I wrote does not work. I wrote a simple function to count the number of duplicate letters in a word, which seems to work sometimes, but doesn't work all the time.

function duplicates(str){
    try{
        return str.match(/(.)\1+/ig).length;
    }catch(e){
        return 0;
    }
}

According to what I have researched, this statement should look through the string, find any letter (or letters) that occur more than once, ignoring case, and return the number of such letters. If no letter repeats, it should return 0. It works correctly for some strings, but not all. Here's what I have been getting:

duplicates("abcdef") -> 0      #should return 0
duplicates("Aabccdef") -> 2    #should return 2
duplicates("Mississippi") -> 3 #should return 3
duplicates("Indivisible") -> 0 #should return 1
duplicates("abcabcabc") -> 0   #should return 3

Upon further inspection, when I ran "Mississippi" I got the expected number 3. However, when I used .toString() in place of .length to see what the expression was counting, I got:

ss,ss,pp

There are no i's counted and there should be. It would also seem that the i's were not counted in "Indivisible" either and did not indicate any repeated letters in "abcabcabc". It seems that it can not count non-consecutive repeats, but I can't figure out why. I am sure it is my misunderstanding of how regular expressions work, since I am new to them, but if anyone would shed some light on why this is happening, that would be awesome!

Edit: Is there a way to do this with RegEx, or do I need to use a loop?

Upvotes: 3

Answers (4)

Tom Wyllie

Reputation: 2085

In terms of the actual regular expression that you've posted, the reason (.)\1+ doesn't work is that the backreference (\1) has to match immediately after the capturing group (.). In the case of 'Mississippi', no two 'i's sit next to each other, so your pattern never matches them.
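To see this for yourself, here is a minimal sketch that returns the matches instead of their count. The helper name consecutiveRuns is made up for illustration; the regex is the one from the question.

```javascript
// Hypothetical helper: return what /(.)\1+/ig matches, not just how many.
function consecutiveRuns(str) {
    // match() returns null when nothing matches, so fall back to an empty array.
    return str.match(/(.)\1+/ig) || [];
}

console.log(consecutiveRuns("Mississippi")); // [ 'ss', 'ss', 'pp' ]
console.log(consecutiveRuns("Aabccdef"));    // [ 'Aa', 'cc' ]
console.log(consecutiveRuns("Indivisible")); // []
```

Only runs of adjacent identical characters show up, which is why the 'i's in "Mississippi" and "Indivisible" are never counted.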

As an alternative solution to this problem, you might as well keep it simple. A more reasonable solution for your use case would be to simply loop through and count each letter.

function duplicates(str){
    try{
        let letters = str.toLowerCase().split('');
        let countedLetters = {}
        for(let i = 0; i < letters.length; i++) {
            countedLetters[letters[i]] = countedLetters[letters[i]] + 1 || 1;
        }
        return countedLetters;
    } catch(e) {
        return 0;
    }
}

console.log(duplicates('Mississippi'));
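Note that the function above returns the whole tally object rather than a single number. If you want the count the question asks for, one possible follow-up is to count the entries whose tally is greater than 1. This is only a sketch; the name countDuplicateLetters is an assumption and not part of the code above.

```javascript
// Hypothetical helper: tally each letter, then count letters seen more than once.
function countDuplicateLetters(str) {
    const counts = {};
    for (const ch of str.toLowerCase()) {
        counts[ch] = (counts[ch] || 0) + 1;
    }
    // Keep only the tallies above 1 and count how many letters that leaves.
    return Object.values(counts).filter(n => n > 1).length;
}

console.log(countDuplicateLetters("abcdef"));      // 0
console.log(countDuplicateLetters("Aabccdef"));    // 2
console.log(countDuplicateLetters("Mississippi")); // 3
console.log(countDuplicateLetters("Indivisible")); // 1
console.log(countDuplicateLetters("abcabcabc"));   // 3
```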

Upvotes: 2

revo

Reputation: 48761

You are close, but because your backreference looks for the captured character immediately after capturing it, you can't count repeated characters that are not neighbors.

The idea is to use a positive lookahead to find characters that occur again later in the string, then drop the duplicate matches so that only unique characters remain to be counted. The regex:

(.)(?=.*\1)

ES6:

function duplicates($str) {
    return [...new Set($str.toLowerCase().match(/(.)(?=.*\1)/g))].length;
}

console.log(duplicates("abcdef"));
console.log(duplicates("Aabccdef"));
console.log(duplicates("Mississippi"));
console.log(duplicates("Indivisible"));
console.log(duplicates("abcabcabc"));

ES5:

function _unique(value, index, self) { 
    return self.indexOf(value) === index;
}

function duplicates($str) {
    return ($str.toLowerCase().match(/(.)(?=.*\1)/g) || Array()).filter(_unique).length;
}

console.log(duplicates("abcdef"));
console.log(duplicates("Aabccdef"));
console.log(duplicates("Mississippi"));
console.log(duplicates("Indivisible"));
console.log(duplicates("abcabcabc"));
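To make the de-duplication step concrete, here is a small sketch of what the lookahead regex matches before the duplicates are removed. The helper name repeatedLater is hypothetical and only used for this illustration.

```javascript
// Hypothetical helper: every character that occurs again later in the string.
function repeatedLater(str) {
    return str.toLowerCase().match(/(.)(?=.*\1)/g) || [];
}

const raw = repeatedLater("Mississippi");
console.log(raw);               // [ 'i', 's', 's', 'i', 's', 'i', 'p' ]
console.log(new Set(raw).size); // 3
```

Each letter is matched once for every occurrence except its last one, so the raw match has 7 entries for "Mississippi". Collapsing it into a Set (or filtering with _unique) leaves one entry per repeated letter.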

Upvotes: 2

qiAlex

Reputation: 4356

I did not investigate the duplicates in 'Indivisible'.

This is not a full answer, because the 'abcabcabc' string can be divided into overlapping subpatterns, such as the 'abcabc' that starts at character 0 and the 'abcabc' that starts at character 3.

Still, I hope this will be useful:

'olololo'.match(/(.+)(?=(\1))/ig)

Upvotes: -1

vijaykrishnavanshi

Reputation: 813

Your regex takes any character from your word, let's say X, and uses a backreference to check whether the very next character is the same. You can use this visualizer to understand what it is doing. It only matches a character that is immediately followed by one or more repeats of itself.

Link with your query : https://regexper.com/#%2F(.)%5C1%2B%2Fig

Link to the website: https://regexper.com/

Upvotes: 0
