user5870683
user5870683

Reputation:

Function to count the amount of times two string match in C

I'm trying to find how may times two string matches in c, if we have two or more stars, multiple string combination can be suitable. e.g "abcdb" & "*b*" matches two times. my current code works but it returns four. I don't what I am missing here.

#include <stdio.h>

int nmatch(char *s1, char *s2) {
    if (*s1 != '\0' && *s2 != '\0' && *s2 == '*' && (*s2 + 1) != *s1) {
        return nmatch(s1 + 1, s2) + 1;
    }
    if (*s1 == *s2) {
        return nmatch(s1 + 1, s2 + 1);
    }
    if (*s1 == *s2 && *s1 == '\0' && *s2 == '\0') {
        return 0;
    }
    return (1);
}

int main() {
    char ap[] = "abcd";
    char ab[] = "*b*";
    printf("%d", nmatch(ap, ab));
    return 0;
}

Upvotes: 0

Views: 1729

Answers (2)

Daniel Jour
Daniel Jour

Reputation: 16156

I think your algorithm is just wrong. Together with the flawed implementation this will lead no where.

There's the input string and the pattern string. If the first character of the pattern is not a * and is not equal to the first character of the input string, then return 0 (it's a mismatch).

If they're equal, then remove the first character of both the input and the pattern and return the number of matches of the reduced input and pattern.

On the other hand, if the first character of the pattern is a *, then (in a loop) sum the number of matches of the remaining pattern with the complete input string, the input string without the first, then without the second character ... and so on until the input string is empty.

If both input and pattern are empty: return 1. If only one of the strings is empty: return 0.


I implemented it as direct recursive function (for production use that should be changed into an iterative version):

unsigned matches(char const * const input, char const * const pattern) {
  if (! *pattern) {
    // An empty pattern matches only the empty
    // string, nothing else
    return (*input) ? 0 : 1;
  }
  if (*pattern == '*') {
    // a wildcard could be zero to any number
    // of characters. Sum the number of matches
    // for each possibility (excluding the empty
    // rest of input for now)
    char const * rest = input;
    unsigned count = 0;
    while (*rest) {
      count += matches(rest, pattern + 1);
      ++rest;
    }
    // Add matches for the empty rest of input
    count += matches(rest, pattern + 1);
    return count;
  }
  if (! *input) {
    // A non empty, non wildcard pattern cannot
    // match the empty string
    return 0;
  }
  if (*pattern == *input) {
    // non wildcard match, count the ways the rest of
    // the pattern matches the rest of the input.
    return matches(input + 1, pattern + 1);
  }
  else {
    // mismatch!
    return 0;
  }
}

(Live with tests)

To deal with the combinatory explosion of possible matches when there are multiple adjacent wildcards in the pattern one could remove these from the pattern first.

Upvotes: 0

chqrlie
chqrlie

Reputation: 144780

Your code just does not count the number of different ways s1 matches pattern s2. It does not even return 1 for identical strings.

The first comparison (*s2 + 1) != *s1 is incorrect, you probably meant *(s2 + 1) != *s1 equivalent to s2[1] != *s1, but this fix is not enough to correct the algorithm.

Here is a naive implementation that does:

int nmatch(const char *s1, const char *s2) {
    int count;
    while (*s2 != '\0' && *s2 != '*' && *s1 == *s2) {
        s1++;
        s2++;
    }
    if (*s2 == '\0')
        return *s1 == '\0';
    if (*s2 != '*')
        return 0;
    while (*s2 == '*')    /* skip all stars */
        s2++;
    if (*s2 == '\0')
        return 1;
    for (count = 0;; s1++) {
        count += nmatch(s1, s2);
        if (*s1 == '\0')
            break;
    }
    return count;
}

Upvotes: 1

Related Questions