Quentin

Reputation: 1086

Different declarations of the same function/global variable in two files

I have two questions about what happens when two files declare the same function or global variable differently, in both C and C++.

  1. Different function declarations

    Consider the following code fragments:

    file_1.c

    void foo(int a);
    
    int main(void)
    {
        foo('A');
    }
    

    file_2.c

    #include <stdio.h>
    
    void foo(char a)
    {
        printf("%c", a); //prints 'A' (gcc)
    }
    

    As we can see, the prototype in file_1.c differs from the definition in file_2.c, yet the function prints the expected value.

    In C++, the above program is invalid: it fails at link time with an undefined reference to foo(int). That is probably because the two functions have different signatures, whereas in C a function's symbol name contains no extra characters indicating the types of its arguments.

    But what about C? Since functions with the same name share the same symbol regardless of the number and types of their arguments, the linker won't issue an error. Which type conversions are performed here? Does it go 'A' -> int -> back to char? Or is this behavior undefined or implementation-defined?

  2. Different declarations of a global variable

    We've got two files and two different declarations of the same global variable:

    file_1.c

    #include <stdio.h>
    
    extern int a;
    
    int main(void)
    {
        printf("%d", a); //prints 65 (g++ and gcc)
    }
    

    file_2.c

    char a = 'A';
    

    Both in C and C++ the output is 65.

    Though I'd like to know what both standards say about that kind of situation.

    In the C11 standard I've found the following fragment:

    J.5.11 Multiple external definitions (Annex J.5 Common extensions)
    There may be more than one external definition for the identifier of an object, with or without the explicit use of the keyword extern; if the definitions disagree, or more than one is initialized, the behavior is undefined (6.9.2).

    Note that it refers to the presence of two or more definitions, while my code has only one, so I'm not sure whether this paragraph is a good point of reference in this case...

Upvotes: 6

Views: 1026

Answers (4)

Quentin

Reputation: 1086

Just so you know, I've accidentally found the paragraph in the C11 standard that covers both issues. It's 6.2.7, paragraph 2:

All declarations that refer to the same object or function shall have compatible type; otherwise, the behavior is undefined.

Upvotes: 1

Rsh

Reputation: 7752

Overloaded functions in C++ work because the compiler encodes each unique method and parameter list combination into a unique name for the linker. This encoding process is called mangling, and the inverse process demangling.

But there is no such thing in C. When the compiler encounters a symbol (either a variable or a function name) that is not defined in the current module, it assumes that it is defined in some other module, generates a linker symbol table entry, and leaves it for the linker to handle. There is no parameter checking at this point.

There is also no type conversion here. In main, you pass a value to foo. Here is its assembly code:

movl    $65, (%esp)
call    foo

And foo reads it from the stack. Since its parameter is declared as char, it stores the input value from the al register (one byte):

movb    %al, -4(%ebp)

So for inputs greater than 255, the value of a seen inside foo wraps around modulo 256 with this particular compiler output.
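The wrap-around can be reproduced without relying on the mismatched declaration. The following is only a sketch: `low_byte` and `check_low_byte` are hypothetical names, and it assumes 8-bit bytes and an ASCII character set. It uses an explicit, well-defined conversion to unsigned char to mimic the one-byte store shown above.

```c
#include <assert.h>

/* Hypothetical helper: keep only the low-order byte of an int, which is
   roughly what the one-byte store from %al does in the listing above. */
unsigned char low_byte(int value)
{
    /* Conversion to an unsigned type is defined as reduction modulo
       UCHAR_MAX + 1, which is 256 when bytes have 8 bits. */
    return (unsigned char)value;
}

/* Self-check; assumes 8-bit bytes and ASCII ('A' == 65). */
void check_low_byte(void)
{
    assert(low_byte(65) == 'A');        /* fits in one byte, unchanged */
    assert(low_byte(65 + 256) == 'A');  /* 321 wraps around to 65 */
    assert(low_byte(256) == 0);         /* exactly one full wrap */
}
```

Unlike the original program, this conversion is fully specified by the language, so the result does not depend on how a particular compiler passes arguments.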

About your second question: in C, symbols for initialized variables and functions are defined as strong, and multiple strong symbols with the same name are not allowed, but I'm not sure whether that is also the case in C++.

Upvotes: 1

Sergey Kalinichenko

Reputation: 726599

Q1. According to the C99 specification, section 6.5.2.2, paragraph 9, it is undefined behavior in C:

If the function is defined with a type that is not compatible with the type (of the expression) pointed to by the expression that denotes the called function, the behavior is undefined.

The expression "points to" a function taking an int, while the function is defined as taking a char.

Q2. The case with variables is also undefined behavior, because file_1.c reads an int from an object that is defined as a char. Assuming 4-byte integers, the read accesses three bytes past the end of the char object. You can test this by defining more variables next to it, like this:

char a = 'A';
char b = 'B';
char c = 'C';
char d = 'D';
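To see why the neighbouring bytes matter, here is a minimal sketch that reads four adjacent bytes as one 32-bit value. The names `read_four_bytes` and `neighbours_change_value` are hypothetical. It uses an array, because separate globals are not guaranteed to be adjacent in memory, and memcpy, so that the read itself is well defined. It assumes ASCII and that `uint32_t` is available.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read four consecutive bytes as one 32-bit value.
   memcpy keeps this well defined, unlike reading a char object through
   an int declaration. */
uint32_t read_four_bytes(const unsigned char *bytes)
{
    uint32_t value;
    memcpy(&value, bytes, sizeof value);
    return value;
}

/* Returns 1 if the three bytes after 'A' changed the value that was read. */
int neighbours_change_value(void)
{
    const unsigned char bytes[4] = { 'A', 'B', 'C', 'D' };
    uint32_t value = read_four_bytes(bytes);

    /* With ASCII this is 0x44434241 on a little-endian machine and
       0x41424344 on a big-endian one; neither is plain 65. */
    assert(value != (uint32_t)'A');
    return value != (uint32_t)'A';
}
```

If the three bytes after 'A' happen to be zero, a little-endian machine reads 65, which is consistent with the output seen in the question.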

Upvotes: 5

Christian Stieber

Reputation: 12496

That's why you put declarations into headers, so even a C compiler can catch the problem.
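As a rough sketch of that layout, the three pieces are shown here in one translation unit so it can be compiled on its own. The header name foo.h and the helper `call_foo` are hypothetical, and foo returns its argument instead of printing it (a change from the question) so that the result can be checked.

```c
#include <assert.h>

/* foo.h (hypothetical shared header): the single declaration of foo. */
int foo(char a);

/* file_2.c would include foo.h. A definition whose parameter type did
   not match the prototype would then be diagnosed by the compiler as a
   conflicting declaration. */
int foo(char a)
{
    return a;
}

/* file_1.c would include foo.h as well, so the argument is converted
   to char according to the visible prototype. */
int call_foo(void)
{
    int result = foo('A');
    assert(result == 'A');
    return result;
}
```

Because both the caller and the definition see the same prototype, the mismatch from the question cannot get past the compiler unnoticed.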

1)

The result of this is pretty much random; in your case, the "char" parameter might be passed as an int (in a register, or on the stack to keep alignment, or whatever). Or you got lucky due to endianness, with the lowest-order byte stored first.

2)

Likely a lucky outcome due to endianness and some zero bytes added as padding to fill up the segment. Again, don't rely on it.

Upvotes: 2
