Reputation: 4031
For clarity I'm only talking about null terminated strings.
I'm familiar with the standard way of doing string comparisons in C with the usage of strcmp. But I feel like it's slow and inefficient.
I'm not necessarily looking for the easiest method but the most efficient.
Can the current comparison method (strcmp) be optimized further while the underlying code remains cross platform?
If strcmp can't be optimized further, what is the fastest way which I could perform the string comparison without strcmp?
Current use case:
Reference to current strcmp() implementation:
Edit: Clarified the solution does not need to be a modification of strcmp.
Edit 2: Added specific examples for this use case.
Upvotes: 0
Views: 2873
Reputation: 144740
I'm afraid your reference imlementation for strcmp()
is both inaccurate and irrelevant:
it is inaccurate because it compares characters using the char
type instead of the unsigned char
type as specified in the C11 Standard:
7.24.4 Comparison functions
The sign of a nonzero value returned by the comparison functions
memcmp
,strcmp
, andstrncmp
is determined by the sign of the difference between the values of the first pair of characters (both interpreted asunsigned char
) that differ in the objects being compared.
It is irrelevant because the actual implementation used by modern compilers is much more sophisticated, expanded inline using hand-coded assembly language.
Any generic implementation is likely to be less optimal, especially if coded to remain portable across platforms.
Here are a few directions to explore if your program's bottleneck is comparing strings.
memcmp()
instead of strcmp()
. memcmp()
is simpler than strcmp()
and can be implemented even more efficiently in places where the strings are known to be properly aligned.EDIT: with the extra information provided, you could use a structure like this for your strings:
typedef struct string_t {
size_t len;
size_t hash; // optional
char str[]; // flexible array, use [1] for pre-c99 compilers
} string_t;
You allocate this structure this way:
string_t *create_str(const char *s) {
size_t len = strlen(s);
string_t *str = malloc(sizeof(*str) + len + 1;
str->len = len;
str->hash = hash_str(s, len);
memcpy(str->str, s, len + 1);
return str;
}
If you can use these str things for all your strings, you can greatly improve the efficiency of the matching by first comparing the lengths or the hashes. You can still pass the str
member to your library function, it is properly null terminated.
Upvotes: 4