Atti
Atti

Reputation: 41

Is my version of the strncmp c function correct?

I am trying to write my version of the function strncmp that already exists in the C language. Below is the solution I have found; I have tested it with a handful of cases and compared the results with the original strncmp C function and I have got same results. Please let me know if there are any flaws in my solution, any errors, etc:

/*
 *  IT IS ENOUGH TO CHECK IF WE REACHED THE LENGTH OF ONE
 *  OF THE STRINGS! :)
 */
int my_strncmp(char *s1, char *s2, unsigned int n)
{
    unsigned int    i;

    i = 0;
    while ((s1[i] == s2[i]) && (s1[i] != 0) && (i < n))
        i++;
    return (s1[i] - s2[i]);
}

#include <stdio.h>
#include <string.h>

int main(void)
{
    char *str1 = "Hello";
    char *barrier = "XXXXXX";
    char *str2 = "Hellz";
    int  n = -1;
    int  max_len = 0;
    int  res;

    max_len = strlen(str1);
    if (strlen(str1) < strlen(str2))
        max_len = strlen(str2);

    // making sure that n is tested way beyond the strlen of strings
    max_len += 2;

    while (n < max_len)
    {
        res = strncmp(str1, str2, n);
        printf("   strncmp(\"%s\",\"%s\", %d) = %d\n", \
            str1, str2, n, res);
        res = my_strncmp(str1, str2, n);
        printf("my_strncmp(\"%s\",\"%s\", %d) = %d\n\n", \
            str1, str2, n, res);
        n++;
    }
    return (0);
}

Upvotes: 1

Views: 222

Answers (2)

RbMm
RbMm

Reputation: 33784

The simplest implementation is:

int my_strncmp(const char *s1, const char *s2, size_t n)
{
    if (n)
    {
        do 
        {
            unsigned char a = *s1++, b = *s2++;

            if (a != b)
            {
                return a < b ? -1 : +1;
            }

            if (!a)
            {
                return 0;
            }
            
        } while (--n);
    }

    return 0;
}

But this will be very slow with long strings compared to an optimized implementation.

It's very important to use unsigned char for comparing. Without unsigned:

/*unsigned*/ char a = *s1++, b = *s2++;

the result will be wrong. (if use a - b as return);

Example:

s1 = "" and s2 = "\xff";

with

char a = *s1++, b = *s2++;

b will 0xff and will be signed extended to 0xffffffff in the expression (a - b) as result will be 0 - 0xffffffff == 1, bit with:

unsigned char a = *s1++, b = *s2++;

the result will be 0 - 0xff == 0xFFFFFF01 (negative).

Simple test:

void test()
{
    char a=0, b=0;
    do 
    {
        do 
        {
            int i = my_strncmp(&a, &b, 1);
            int j = strncmp(&a, &b, 1);

            if ((i ^ j) & 0x80000000)
            {
                __debugbreak();
            }
             
        } while (--b);
    } while (--a);
}

Upvotes: 2

Eric Postpischil
Eric Postpischil

Reputation: 224112

  1. The type of the parameter n and of the counter i should be size_t, not unsigned int.

  2. The type of the first two parameters should be const char *.

  3. Although the strings are passed as pointers to char (with const), the characters should be interpreted as if they had the type unsigned char, per C 2018 7.24.1 3.

  4. s1 and s2 should never be accessed beyond n characters, but the code tests i < n last, after using s1[i] and s2[i], instead of before using them.

  5. The code also accesses s1[i] and s2[i] in the return statement even if the loop terminated because i reached n. Besides causing the routine to access the arrays out of bounds, this also means the return value is very likely to be wrong, as it should be zero but instead will be based on evaluations of the incorrectly accessed s1[i] and s2[i].

More theoretically:

  1. It is conceivable char and unsigned char are the same width as int and therefore s1[i] - s2[i] can overflow or wrap. Instead, the values should be compared and used to select a positive, zero, or negative return value.

Upvotes: 3

Related Questions