Reputation: 8433
I am not convinced by the definition of strcmp(): it says it compares the entire strings (the two strings passed in as parameters) and returns -1, 0, or 1 depending on whether s1 < s2, s1 is the same as s2, or s1 > s2.
If so, how do you justify the snippet below from K&R? After the very first non-equal character is encountered, it exits the loop. How are we comparing all characters of s1 and s2 until the end ('\0')?
char *p1 = &str1[0], *p2 = &str2[0];
while (1) {
    if (*p1 != *p2)
        return *p1 - *p2;
    if (*p1 == '\0' || *p2 == '\0')
        return 0;
    p1++;
    p2++;
}
Upvotes: 1
Views: 1088
Reputation: 144951
Most implementations of strcmp() return as soon as they reach a position where the two strings differ. Hence the function does not always scan the whole strings: a full scan is only necessary when the strings are identical.
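To make the early exit visible, here is a minimal sketch of a comparison that also reports how many character positions it read. The name compare_counting and its third parameter are hypothetical, not part of the C library; the loop itself mirrors the usual strcmp() logic.

```c
#include <stddef.h>

/* Hypothetical helper (not a standard function): compares like strcmp()
   and stores in *examined the number of positions that were read. */
int compare_counting(const char *s1, const char *s2, size_t *examined) {
    size_t n = 1;                       /* the first position is always read */
    while (*s1 == *s2 && *s1 != '\0') { /* advance only while equal and not at the end */
        s1++;
        s2++;
        n++;
    }
    *examined = n;
    /* Sign of the difference at the first mismatch (or 0 at the terminator). */
    return (unsigned char)*s1 - (unsigned char)*s2;
}
```

Comparing "apple" with "banana" reads a single position, while comparing "abc" with itself reads all four (three letters plus the terminating '\0').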
Note that strcmp() does not necessarily return -1 or 1 if the strings differ: only the sign of the return value is relevant to determine whether s1 is lexicographically less than or greater than s2.
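In practice this means callers should test the result against zero rather than against -1 or 1. If a normalized value is really wanted, a small wrapper can produce it; compare_sign below is a hypothetical helper, shown only as a sketch.

```c
#include <string.h>

/* Hypothetical helper: collapses any strcmp() result to -1, 0, or 1,
   relying only on the sign of the value strcmp() returns. */
int compare_sign(const char *s1, const char *s2) {
    int r = strcmp(s1, s2);
    return (r > 0) - (r < 0);
}
```

Code such as `if (strcmp(a, b) < 0)` stays portable, whereas `if (strcmp(a, b) == -1)` may silently fail on implementations that return the raw character difference.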
Also worth mentioning: char elements are compared by their values converted to unsigned char, which makes the result independent of the compiler setting for char signedness. The same is true for memcmp.
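A quick way to see the effect, sketched with a hypothetical helper name: the byte 0xE9 is negative when char is signed, yet it must still sort after 'a' (97), because both functions compare the bytes as unsigned char values (233 versus 97).

```c
#include <string.h>

/* Hypothetical check: returns 1 if the byte 0xE9 sorts after 'a' under both
   strcmp() and memcmp(), which should hold whether char is signed or not. */
int high_byte_sorts_after_ascii(void) {
    const char hi[] = "\xE9";
    const char lo[] = "a";
    return strcmp(hi, lo) > 0 && memcmp(hi, lo, 1) > 0;
}
```

A hand-written loop that subtracts plain char values without the cast could order these two strings the other way round on a platform where char is signed.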
Here is a sample (and simple) implementation for architectures with sizeof(char) < sizeof(int), where the difference of two unsigned char values always fits in an int:
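int strcmp(const char *s1, const char *s2) {
    /* Advance while the characters match and the end has not been reached. */
    while (*s1 == *s2 && *s1 != '\0') {
        s1++;
        s2++;
    }
    /* Compare as unsigned char so the sign does not depend on char signedness. */
    return (unsigned char)*s1 - (unsigned char)*s2;
}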
Upvotes: 0
Reputation: 57418
strcmp() will halt as soon as the first non-equal character is found, since that is enough to determine the lexicographic order of any two words.
This is the norm for "classic" strcmp() (1).
It follows that by carefully timing a strcmp() of a known string against an unknown string (e.g. a password), one might be able to determine which character actually failed the comparison. That would then allow an attacker to guess the unknown string one character at a time, using at most 25*N attempts (or 255*N if every nonzero byte value is allowed) for a string of N characters. This is known as a timing attack.
So there are also "secure" implementations of strcmp() whose running time is proportional to the length of the first string supplied, regardless of where the strings first differ. You might conceivably have encountered a reference to one of these.
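As a rough illustration of the idea, and not a vetted cryptographic routine, the sketch below always walks the whole first string and accumulates differences instead of returning early. The name secure_str_differs is hypothetical, and the function only reports equality (0) or inequality (1), not ordering. Whether it is truly constant-time also depends on the compiler and the hardware, so treat it as an assumption-laden example.

```c
#include <stddef.h>

/* Hypothetical sketch: returns 0 if s1 and s2 are equal, 1 otherwise.
   The loop always runs over all of s1, wherever the first mismatch is. */
int secure_str_differs(const char *s1, const char *s2) {
    unsigned int diff = 0;
    size_t j = 0;
    for (size_t i = 0; s1[i] != '\0'; i++) {
        /* Accumulate any differing bits; no early return on mismatch. */
        diff |= (unsigned int)((unsigned char)s1[i] ^ (unsigned char)s2[j]);
        /* Advance in s2 only until its terminator, so we never read past it. */
        j += (s2[j] != '\0');
    }
    /* If s2 is longer than s1, s2[j] is a nonzero leftover character. */
    diff |= (unsigned char)s2[j];
    return diff != 0;
}
```

For real password or token checks, a reviewed library routine is the safer choice over a hand-rolled loop like this one.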
(1) Other text-comparing functions exist that may handle things differently, e.g. with wide and multibyte characters, where it may sometimes be desirable for two different characters to compare as equal, say "a" and "à". It could be argued, though, that this is no longer simple comparison but the more complex matter of collation.
Upvotes: 3
Reputation: 320631
strcmp performs a classic lexicographical comparison of two strings. The result of a lexicographical comparison is known the very moment you find the first non-matching character. That's exactly what you see in the implementation.
Upvotes: 0