Tao Yu

Reputation: 11

Why do GCC and Clang give different results for the following code?

I got different results for the following code with gcc and clang. I believe it is not a serious bug, but I wonder which result conforms better to the standard. Thanks a lot for your reply.

I use gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0 and clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)

#include <stdio.h>
int get_1(){
        printf("get_1\n");
        return 1;
}
int get_2(){
        printf("get_2\n");
        return 2;
}
int get_3(){
        printf("get_3\n");
        return 3;
}
int get_4(){
        printf("get_4\n");
        return 4;
}
int main(int argc, char *argv[])
{
        printf("%d\n",get_1() + get_2() - (get_3(), get_4()));
        return 0;
}

The result with gcc is:

get_3
get_1
get_2
get_4
-1

and the result with clang is:

get_1
get_2
get_3
get_4
-1

Upvotes: 0

Views: 502

Answers (3)

Lundin

Reputation: 214810

There are two different but related concepts at play: operator precedence and order of evaluation.

Operator precedence dictates parsing order:

  • In your expression, the parentheses bind most tightly, so what's inside them belongs together.

  • Next we have the function call operators (). Nothing strange there, they are postfix and belong to their operand, the function name.

  • Next up we have binary + and - operators. They belong to the same operator group "additive operators" and have the same precedence. When this happens, operator associativity for operators of that group decides in which order they should be parsed.

    For additive operators, the associativity is left-to-right, meaning that the expression is guaranteed to be parsed as (get_1() + get_2()) - ....

  • And finally we have the oddball comma operator, with lowest precedence of all.

Once the operator precedence is sorted out as per above, we know which operands belong to which operators. But this says nothing about the order in which the expression will get executed. That is where order of evaluation comes in.
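The grouping can be checked with a small sketch. The function names below are made up for illustration and take plain values instead of calling the get_N functions; the (void) casts are only there to silence unused-value warnings for the left operand of the comma.

```c
/* Hypothetical helpers, not from the question: they only illustrate grouping. */

/* Written like the question's expression, with plain values as operands. */
int implicit_grouping(int a, int b, int c, int d) {
    return a + b - ((void)c, d);
}

/* The same expression with the grouping spelled out in parentheses. */
int explicit_grouping(int a, int b, int c, int d) {
    return (a + b) - ((void)c, d);
}

/* Left-to-right associativity: parsed as (a - b) - c, not a - (b - c). */
int chained_minus(int a, int b, int c) {
    return a - b - c;
}
```

With the question's values 1, 2, 3, 4, both grouping functions compute 1 + 2 - 4, which matches the -1 printed by both compilers. None of this says anything about which operand is evaluated first.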

Generally C says, in dry standard terms:

Except as specified later, side effects and value computations of subexpressions are unsequenced.

In plain English this means that the order of evaluation of operands is unspecified, for the most part, with some special exceptions.

For the additive operators + and -, this is true. Given a + b we cannot know if a or b will get executed first. The order of evaluation is unspecified - the compiler may execute it in any order it pleases, need not document how, and need not even behave consistently from case to case.

This is intentionally left unspecified by the C standard, to allow different compilers to evaluate expressions differently. This essentially lets them keep their expression tree algorithm a compiler trade secret, so that some compilers can produce more efficient code than others on a free market.

And this is why gcc and clang give different results. You have written code that relies on the order of evaluation. This is no fault of either compiler - we should simply not write programs that rely on poorly-specified behavior. If you have to execute those functions in a certain order, you should split them up over several lines/expressions.
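As a rough sketch of that advice, the calls can be moved into separate statements, each of which ends in a sequence point. The helper get_n and the call_log buffer are hypothetical stand-ins for the question's get_1 ... get_4 functions; they record the call order in a string instead of printing it.

```c
/* Hypothetical stand-in for get_1 ... get_4: records which call ran. */
char call_log[16];
int log_len = 0;

int get_n(int n) {
    call_log[log_len++] = (char)('0' + n);  /* append the digit of this call */
    call_log[log_len] = '\0';
    return n;
}

/* Same arithmetic as the question, but every call is its own statement,
   so the calls happen in source order on any conforming compiler. */
int ordered_result(void) {
    log_len = 0;
    call_log[0] = '\0';

    int a = get_n(1);
    int b = get_n(2);
    (void)get_n(3);      /* value discarded, like the left side of the comma */
    int d = get_n(4);

    return a + b - d;
}
```

After a call to ordered_result(), call_log should hold "1234" regardless of the compiler, and the return value is still -1.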

As for the comma operator, it is one of the rare special cases. It comes with a built-in "sequence point" which guarantees that the left operand is always evaluated (executed) before the right. Other such special cases are the && and || operators and the ?: operator.
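A minimal sketch of these special cases follows. The step helper and the trace buffer are invented for this example; each operand appends a one-letter tag to trace when it is evaluated, so the order (and any skipped operand) can be inspected afterwards.

```c
#include <string.h>

/* Hypothetical tracing helper: notes that an operand ran, then yields value. */
char trace[32];

int step(const char *tag, int value) {
    strcat(trace, tag);
    return value;
}

/* Comma: left operand first, then right; the result is the right operand. */
int comma_demo(void) {
    trace[0] = '\0';
    return (step("a", 3), step("b", 4));
}

/* &&: left operand first; the right one is skipped when the left is 0. */
int and_demo(void) {
    trace[0] = '\0';
    return step("a", 0) && step("b", 1);
}

/* ||: left operand first; the right one is skipped when the left is nonzero. */
int or_demo(void) {
    trace[0] = '\0';
    return step("a", 1) || step("b", 1);
}

/* ?: the condition first, then exactly one of the two branches. */
int cond_demo(void) {
    trace[0] = '\0';
    return step("c", 1) ? step("t", 10) : step("f", 20);
}
```

For instance, comma_demo() returns 4 and leaves "ab" in trace, while and_demo() returns 0 and leaves only "a" because the right operand is never evaluated.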

Upvotes: 2

alinsoar

Reputation: 15813

C does not impose an order in evaluating the operands of some operators. The order of evaluation is imposed in the C standard by sequence points. When a sequence point is present, a sound implementation of the language must finish evaluating everything to the left of the sequence point before it starts evaluating what is on the right side. The + and - operators do not contain any sequence point. Here is the definition from 5.1.2.3 p2:

At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place.

In your expression

get_1() + get_2() - (get_3(), get_4())

you have the +, - and the comma , operator. Only the comma imposes an order of evaluation; the + and - do not.

Upvotes: 8

Colin

Reputation: 3524

The , between get_3() and get_4() is the only sequence point that orders the get_x calls relative to each other in printf("%d\n",get_1() + get_2() - (get_3(), get_4()));. The get_x calls can happen in any order the compiler chooses, as long as get_3() happens before get_4().

You're seeing the result of unspecified behaviour.
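That single constraint can be written down as a small check. The function below is a hypothetical illustration: it takes the observed call order as a string of digits (for example "3124" for the gcc output in the question) and only tests that get_3 came before get_4. It assumes each digit appears exactly once and does not verify that.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical checker: order is the sequence of get_N calls as digits.
   The only requirement checked is that '3' appears before '4'. */
bool order_is_permitted(const char *order) {
    const char *p3 = strchr(order, '3');
    const char *p4 = strchr(order, '4');
    return p3 != NULL && p4 != NULL && p3 < p4;
}
```

Both outputs from the question pass this check: "3124" (gcc) and "1234" (clang). An order such as "1243" would not, because get_4 would run before get_3.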

Upvotes: 5
