Subash Rajaram

Reputation: 41

Mystery with strcmp output - How does strcmp actually compare the strings?

I want to know why strcmp() returns different values when it is called more than once in the same function with the same string contents. The program is below. I understand why the first call prints -6, but why does the second call print -1?

#include<stdio.h>
#include<string.h>
int main()
{
    char a[10] = "aa";
    char b[10] = "ag";
    printf("%d\n",strcmp(a, b));
    printf("%d\n",strcmp("aa","ag"));
    return 0;
}

And the output it produces is below

[sxxxx@bhlingxxx test]$ gcc -Wall t51.c
[sxxxx@bhlingxxx test]$ ./a.out
    -6
    -1

Why is the output of the second strcmp() -1? Is the compiler responsible for this? If so, what exact optimization does it perform?

Upvotes: 3

Views: 541

Answers (3)

Rishikesh Raje

Reputation: 8614

From https://linux.die.net/man/3/strcmp:

The strcmp() function compares the two strings s1 and s2. It returns an integer less than, equal to, or greater than zero if s1 is found, respectively, to be less than, to match, or be greater than s2.

The strcmp function only promises to return a negative value for the comparison given above. The actual value returned is not specified.

What has probably happened is that for strcmp("aa","ag") the compiler knows at compile time that the result is negative and optimises the call to the constant -1.

Upvotes: 5

Govind Parmar

Reputation: 21562

The only thing that the C standard guarantees for strcmp is that the sign of the return value will indicate the direction of the inequality if there is one, or zero if the strings are exactly equal.

While returning the difference between the numeric values of the chars at the first place they differ is a fairly common implementation, it's not required. If the compiler can look at string constants and know right away what the result of strcmp will be, it may add a flat -1, 1, or 0 in its place rather than go through the effort of actually calling the function.

The solution to this is to not write code that relies on a particular implementation of strcmp, no matter how common it may be. Only trust the sign of the return value.

Upvotes: 1

dbush

Reputation: 224387

The C standard says the following regarding the return value of strcmp:

Section 7.24.4.2p3:

The strcmp function returns an integer greater than, equal to, or less than zero, accordingly as the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2.

So as long as the result fits that description, it is compliant with the C standard. That means the compiler is free to perform optimizations, provided the result still fits that definition.

If we look at the assembly code:

.loc 1 7 0
leaq    -32(%rbp), %rdx
leaq    -48(%rbp), %rax
movq    %rdx, %rsi
movq    %rax, %rdi
call    strcmp
movl    %eax, %esi
movl    $.LC0, %edi
movl    $0, %eax
call    printf
.loc 1 8 0
movl    $-1, %esi      # result of strcmp is precomputed!
movl    $.LC0, %edi
movl    $0, %eax
call    printf

In the first case, arrays are passed to strcmp, so a call to strcmp and a call to printf are generated. In the second case, however, string constants are passed as both arguments. The compiler sees this and computes the result itself, optimizing out the actual call to strcmp, and passes the hardcoded value -1 to printf.

Upvotes: 6
