Andreas Haferburg

Reputation: 5510

Is the linker allowed to ignore that two struct definitions are not equal?

If you have two translation units that use the same struct A, you typically place that struct in a header, i.e. you let the preprocessor paste the same definition into both TUs.

So let's say we don't do that. Are we allowed to use different definitions? Why?

createA.c:

typedef struct {
  int data;
} A;

A createA() {
  A a = {10};
  return a;
}

main.c:

#include <stdio.h>

typedef struct {
  int value;
  float x;
} A;

A createA();

int main(int argc, char *argv[])
{
  A a = createA();
  printf("value is %d, x is %f\n", a.value, a.x);
  a.x = 3.1f;
  printf("value is %d, x is %f\n", a.value, a.x);
}

VS 10 happily links this together, even though the two structs differ in size. That means a call to createA is handled differently in the two TUs (different stack modifications), which is a bit weird. When I add arguments to the function, the program crashes at runtime, but there is still no linker error.

So, why is this interesting? I was wondering whether this mechanism could be used for a Pimpl variant in which, instead of impl* impl_data, the public header contains something like unsigned char impl_data[impl_size], and the implementation file casts it to the actual impl struct. Practical considerations aside (e.g. determining impl_size), is this allowed?

Upvotes: 3

Views: 172

Answers (3)

rici

Reputation: 241711

The linker has no idea how structs are defined. Or even that structs exist. All the linker cares about is resolving names with external linkage, and structs are compile-time artefacts.

The linker doesn't know about function prototypes either. So it has no idea whether the arguments to a function called in some translation unit actually correspond to the arguments expected by that function compiled into a different translation unit. All the linker will do is make sure that the call invokes the function with that name.

So the linker is certainly allowed to ignore a number of errors. But that does not make the errors correct. They are still errors. It is your responsibility as a programmer to ensure that your programs are correct, and that structure definitions and prototypes in one translation unit are compatible with those in another translation unit. C will not act as your guardian angel.
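The usual way to meet that responsibility is the one the question mentions: keep the struct definition and the prototype in a single header that both translation units include. Below is a minimal single-file sketch of that layout. The file name a.h is hypothetical, and the comments mark where each file would begin.

```c
/* ---- a.h (hypothetical shared header) ---- */
#ifndef A_H
#define A_H

typedef struct {
    int data;
} A;

/* (void) makes this a real prototype; before C23 an empty ()
   leaves the parameter list unchecked. */
A createA(void);

#endif /* A_H */

/* ---- createA.c: would start with #include "a.h" ---- */
A createA(void) {
    A a = {10};
    return a;
}

/* ---- main.c: would also #include "a.h", so both TUs see the same A ---- */
```

Because the definition and the declaration now come from the same text, the compiler checks createA against its prototype when it compiles createA.c. The linker still checks nothing.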

That may not seem totally user-friendly, and in many ways it isn't. There are languages which are designed to detect errors like that. But C does not have that philosophy.

With respect to the concrete example of using struct s { char _[sizeof_struct_s]; }; in one translation unit and the real struct s in another one, the result is Undefined Behaviour for which no error message is required. So it should not be used in conformant portable code. Nonetheless, if you can guarantee appropriate alignment of the character array and you can correctly compute the size (which will be a big maintainability headache), it will probably work on many architectures. Of course, there are no guarantees, and if it breaks, you get to keep the pieces.
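For illustration, here is a hedged sketch of the opaque-storage idea. It copies the implementation struct in and out of the buffer with memcpy instead of casting the buffer pointer, since the cast is the part that runs into the aliasing rules. All names (Counter, CounterImpl, COUNTER_IMPL_SIZE) are hypothetical, the size constant is an assumption maintained by hand, and the comments mark which part would live in the public header and which in the implementation file.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* max_align_t */
#include <string.h>   /* memcpy */

/* ---- public header part (hypothetical names) ---- */
#define COUNTER_IMPL_SIZE 16   /* assumed upper bound, maintained by hand */

typedef struct {
    /* Opaque storage. The alignment only matters if the implementation
       casts the buffer instead of copying, but it is cheap to request. */
    _Alignas(max_align_t) unsigned char impl_data[COUNTER_IMPL_SIZE];
} Counter;

void counter_init(Counter *c, int start);
int  counter_next(Counter *c);

/* ---- implementation file part ---- */
typedef struct {
    int value;
    int step;
} CounterImpl;

/* Fail the build, rather than corrupt memory, if the buffer is too small. */
static_assert(sizeof(CounterImpl) <= COUNTER_IMPL_SIZE,
              "COUNTER_IMPL_SIZE is too small for CounterImpl");

void counter_init(Counter *c, int start) {
    CounterImpl impl = { start, 1 };
    memcpy(c->impl_data, &impl, sizeof impl);
}

/* Returns the current value, then advances it by step. */
int counter_next(Counter *c) {
    CounterImpl impl;
    memcpy(&impl, c->impl_data, sizeof impl);   /* copy out of the buffer */
    int result = impl.value;
    impl.value += impl.step;
    memcpy(c->impl_data, &impl, sizeof impl);   /* copy back */
    return result;
}
```

A caller would declare a Counter on the stack, call counter_init(&c, 5), and then get 5 and 6 from two successive counter_next(&c) calls. The static_assert covers the size half of the maintainability headache; keeping COUNTER_IMPL_SIZE from growing larger than necessary is still manual work.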

Upvotes: 4

Steve Summit

Reputation: 47942

The linker isn't really involved here. If you've got a function f() that's defined in one translation unit and called from another, and if it's declared with different types in the two translation units, that's (a) a pretty serious problem and (b) not one that traditional linkers (where by "traditional" I mean the relatively simpleminded ones C was designed around) can detect. Having different declarations of the struct type where the function is defined versus where it's called isn't really any different from defining a function that returns a double in one translation unit but then declaring and calling it as if it returns an int in another. (My point is that you don't tend to get a complaint from the linker there, either.)
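Both kinds of mismatch come down to the caller and the callee disagreeing about how many bytes make up the return value. A small sketch of the struct case from the question follows. The type names A_callee and A_caller and the two helper functions are made up for illustration, because a single translation unit cannot hold two definitions of A.

```c
#include <stddef.h>  /* size_t */

/* The struct as createA.c defines it (renamed; hypothetical name). */
typedef struct {
    int data;
} A_callee;

/* The struct as main.c defines it (renamed; hypothetical name). */
typedef struct {
    int value;
    float x;
} A_caller;

/* Number of bytes the callee produces as its return value. */
size_t bytes_returned(void) { return sizeof(A_callee); }

/* Number of bytes the caller reserves and reads for the result. */
size_t bytes_expected(void) { return sizeof(A_caller); }
```

On a typical platform with 4-byte int and float, these would likely be 4 and 8. Both numbers are compile-time constants baked into each object file's code; the symbol the linker resolves carries little more than the name createA.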

Upvotes: 1

fuz

Reputation: 92984

If two translation units use different declarations of the same structure type, that's fine and in fact not that uncommon.

If an external symbol is declared in two translation units with different signatures (where the same signature with differently defined parameter types counts as a different signature), the behaviour is undefined.

Note that this way, two parts of a program can have different declarations of the same structure type, but they have no way of detecting the discrepancy, because the two definitions can never come into contact with one another without causing undefined behaviour.

Does that answer your question?

Upvotes: 1
