Adam S
Adam S

Reputation: 9235

Alphabetic comparison without strcmp()?

I am attempting to write a custom strcmp() function without using the built-in function. So far, my code feels sort of convoluted. Essentially I want the order of characters to be like this:

  1. Special characters (in the order they appear)
  2. Numbers
  3. Alphabetic characters, in order, but capitals first, i.e. "AaBbCcDd"

It is to return 1 if string1 comes before string2, -1 if string2 comes before string1, and 0 if they are equal.

Here is the code I have:

int strcmp(char * string1, char * string2)
{
    while((*string1 != '\0') && (*string2 != '\0') && (*string1 == *string2))
    {
        ++string1;
        ++string2;
    }

    //If both are now zero, they are equal
    if (*string1 == *string2 == '\0') { return 0; }

    //If string1 is comes before, return 1
    //If string2 is comes before, return -1
    int type1 = (isalpha(string1) ? 2 : (isnum(string1) ? 1 : 0))
    int type2 = (isalpha(string2) ? 2 : (isnum(string2) ? 1 : 0))
    return ((type1 < type2) 1 : ((type2 < type1) -1 :
        (((*string1 >= 'a') ? (*string1 - 'a')*2+1 : (*string1 - 'a')*2) < 
        ((*string2 >= 'a') ? (*string2 - 'a')*2+1 : (*string2 - 'a')*2) ? 1 : -1)));
}

There are two things I am not sure about:

  1. Whether assigning "categories" is the right approach. Right now I assign type 0 to special characters, type 1 to numbers, and type 2 to alphabetic characters. This way I can quickly compare types.
  2. Whether my approach of using algebraic operations is appropriate for establishing the character order of alphabetic characters.

Are these good approaches? Are there better? Please keep in mind I am maximizing for efficiency.

Upvotes: 2

Views: 2695

Answers (4)

Kevin Heffernan
Kevin Heffernan

Reputation: 246

Here's my attempt at it. I actually replicate the normal function of strcmp(), so if the strings don't match, it returns the difference between the first element of each string. For example, strcmp("apple","zebra") returns 25 while strcmp("zebra","apple") returns -25

#include <stdio.h>
#include <string.h>

int my_strcmp(char* arg1, char* arg2) {
  while(arg1++ == arg2++);
  return (--arg1==--arg2&&strlen(arg1)==strlen(arg2))?0:arg2[0]-arg1[0];
}

int main(int argc, char* argv[]) {
  printf("%d\n",my_strcmp(argv[1],argv[2]));
}

Upvotes: 0

Edwin Buck
Edwin Buck

Reputation: 70929

try

int strcmp(const char * string1, const char * string2)
{

while (*string1 == *string2++)
    if (*string1++ == 0)
        return (0);
    // then check for the ordering according to taste

}

While the chars are the same, you'll increment s2, then check to see if s1's next char is null, incrementing it as you check. This has the effect of incrementing both pointers while embedding a quick exit if you run to the end of the string. It should pack into assembly quite tightly.

This leaves you with a simplified scenario, where you only need to determine what the next character is in relation to the other

Upvotes: 0

Erik Olson
Erik Olson

Reputation: 1164

Assuming 8 bit chars, you could populate a lookup table. Use your existing compare code to sort a table of all possible char values, then make a table of index numbers for each character.

Then your inner loop only has to look up 1 index number for each char in the string, and compare ints.

#include <stdio.h>

static int my_strcmp_order[256]; // you fill this in

int my_strcmp(const char *s1, const char *s2)
{
        while (*s1 == *s2++) {
                if (*s1++ == '\0') return 0;
        }
        return my_strcmp_order[*(const unsigned char*)s1]
                - my_strcmp_order[*(const unsigned char*)(s2-1)];
}

int main()
{
        for (int i=0; i<256; i++) {
                my_strcmp_order[i] = i; // native sort order - you fill it your way
        }

        const char *s1 = "Abc";
        const char *s2 = "Abcd";
        const char *s3 = "";
        printf("s1 <=> s2 = %d\n", my_strcmp(s1, s2));
        printf("s1 <=> s3 = %d\n", my_strcmp(s1, s3));
        printf("s3 <=> s2 = %d\n", my_strcmp(s3, s2));
}

Upvotes: 2

taskinoor
taskinoor

Reputation: 46037

The obvious problem that I see is the following line.

if (*string1 == *string2 == '\0') { return 0; }

This will not work as expected. This will not return zero if they are equal. If string1 and string2 are equal then *string1 == *string2 is true, or equivalent to non-zero value and thus will never be equal to \0. This condition should be

if ((*string1 == '\0') && (*string2 == '\0')) {}

And do not use ternary operators in this way, as they lead to less less readable code.

Upvotes: 0

Related Questions