RJ Trujillo
RJ Trujillo

Reputation: 13

Search a string for all occurrences of a substring in C++

Write a function countMatches that searches the substring in the given string and returns how many times the substring appears in the string.

I've been stuck on this awhile now (6+ hours) and would really appreciate any help I can get. I would really like to understand this better.

int countMatches(string str, string comp)
{
    int small = comp.length();
    int large = str.length();
    int count = 0;

    // If string is empty
    if (small == 0 || large == 0) {
        return -1;
    }

    // Increment i over string length
    for (int i = 0; i < small; i++) {
        // Output substring stored in string
        for (int j = 0; j < large; j++) {
            if (comp.substr(i, small) == str.substr(j, large)) {
                count++;
            }
        }
    }

    cout << count << endl;
    return count;
}

When I call this function from main, with countMatches("Hello", "Hello"); I get the output of 5. Which is completely wrong as it should return 1. I just want to know what I'm doing wrong here so I don't repeat the mistake and actually understand what I am doing.

Upvotes: 0

Views: 2495

Answers (3)

Pete Becker
Pete Becker

Reputation: 76235

The usual approach is to search in place:

std::string::size_type pos = 0;
int count = 0;
for (;;) {
    pos = large.find(small, pos);
    if (pos == std::string::npos)
        break;
    ++count;
    ++pos;
}

That can be tweaked if you're not concerned about overlapping matches (i.e., looking for all occurrences of "ll" in the string "llll", the answer could be 3, which the above algorithm will give, or it could be 2, if you don't allow the next match to overlap the first. To do that, just change ++pos to pos += small.size() to resume the search after the entire preceding match.

Upvotes: 0

Davide Spataro
Davide Spataro

Reputation: 7482

The problem with your function is that you are checking that:

  • Hello is substring of Hello
  • ello is substring of ello
  • llo is substring of llo
  • ...

of course this matches 5 times in this case.


What you really need is:

  • For each position i of str
  • check if the substring of str starting at i and of length = comp.size() is exactly comp.

The following code should do exactly that:

size_t countMatches(const string& str, const string& comp)
{
    size_t count = 0;
    for (int j = 0; j < str.size()-comp.size()+1; j++)
         if (comp == str.substr(j, comp.size()))
              count++;
    return count;
}

Upvotes: 0

RJ Trujillo
RJ Trujillo

Reputation: 13

I figured it out. I did not need a nested for loop because I was only comparing the secondary string to that of the string. It also removed the need to take the substring of the first string. SOOO... For those interested, it should have looked like this:

int countMatches(string str, string comp)
{
    int small = comp.length();
    int large = str.length();
    int count = 0;

    // If string is empty
    if (small == 0 || large == 0) {
        return -1;
    }

    // Increment i over string length
    for (int i = 0; i < large; i++) {
        // Output substring stored in string
        if (comp == str.substr(i, small)) {
            count++;
        }
    }

    cout << count << endl;
    return count;
}

Upvotes: 1

Related Questions