codeomnitrix

Reputation: 4249

How are library functions linked in this case?

I just came across this code, and the blog says it works fine on a 32-bit architecture. I didn't test it; however, I have a doubt about how the libraries are linked in this case. How will the compiler link the string library into main when it doesn't know which library to link?

So basically, if I include <string.h> it should work fine; however, if I don't include <string.h> then, according to the blog, it runs on a 32-bit architecture and fails on a 64-bit architecture.

#include <errno.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    FILE *fp;

    fp = fopen(argv[1], "r");
    if (fp == NULL) {
        fprintf(stderr, "%s\n", strerror(errno));
        return errno;
    }

    printf("file exist\n");

    fclose(fp);

    return 0;
}

Upvotes: 3

Views: 4268

Answers (3)

nikolayp

Reputation: 17949

I solved it by including the string.h header:

#include <string.h>

Upvotes: 2

Jonathan Leffler

Reputation: 755064

The code shown will only compile if you allow the compiler to infer that functions that are not declared always return an int. This was valid in C89/C90 but marked obsolescent; C99 and C11 require functions to be declared before they are used. GCC prior to version 5.1.0 assumes C90 mode by default; you had to turn the 'reject this code' warnings on. GCC 5.1.0 and onwards assumes C11 by default. You will at least get warnings from the code even without any compilation options to turn them on.

The code will link fine because the function name is strerror() regardless of whether it was declared or not, and the linker can find the function in the standard C library. In general, all the functions that are in the Standard C library are automatically made available for linking — and, indeed, there are usually a lot of not so standard functions also available. C does not have type-safe linkage as C++ does (but C++ also insists on having every function declared before it is used, so the code would not compile as C++ without the header.)

For historical reasons, the maths library was separate and you needed to specify -lm in order to link it. This was in large part because hardware floating point was not universal, so some machines needed a library using the hardware, and other machines needed software emulation of the floating point arithmetic. Some platforms (Linux, for example) still require a separate -lm option if you use functions declared in <math.h> (and probably <tgmath.h>); other platforms (Mac OS X, for example) do not — there is a -lm to satisfy build systems that link it, but the maths functions are in the main C library.

If the code is compiled on a fairly standard 32-bit platform with ILP32 (int, long, pointer all 32-bit), then on many architectures an int return value occupies the same amount of space as the char * that strerror() actually returns, so the wrong assumption happens to be harmless. When the code pushes the return value from strerror() onto the stack for fprintf(), the correct amount of data is pushed.

Note that some architectures (notably the Motorola M680x0 series) would return addresses in an address register (A0) and numbers in a general register (D0), so there would be problems even on those machines with a 32-bit compilation: the compiler would try to get the returned value from the data register instead of the address register, and that was not set by strerror() — leading to chaos.

With a 64-bit architecture (LP64), assuming strerror() returns a 32-bit int means that the compiler will only collect 32 bits of the 64-bit address returned by strerror() and pass that on to fprintf() to work with. When fprintf() then treats the truncated address as valid, things go awry, often leading to a crash.

When the missing <string.h> header is added, the compiler knows that the strerror() function returns a char * and all is happiness and delight once more, even when the file the program is told to look for doesn't exist.

If you are wise, you will ensure your compiler is always compiling in fussy mode, rejecting anything which is plausibly erroneous. When I use my default compilation on your code, I get:

$ gcc -std=c11 -O3 -g -Wall -Wextra -Werror -Wmissing-prototypes \
>      -Wstrict-prototypes -Wold-style-definition bogus.c -o bogus
bogus.c: In function ‘main’:
bogus.c:10:33: error: implicit declaration of function ‘strerror’ [-Werror=implicit-function-declaration]
         fprintf(stderr, "%s\n", strerror(errno));
                                 ^
bogus.c:10:25: error: format ‘%s’ expects argument of type ‘char *’, but argument 3 has type ‘int’ [-Werror=format=]
         fprintf(stderr, "%s\n", strerror(errno));
                         ^
bogus.c:10:25: error: format ‘%s’ expects argument of type ‘char *’, but argument 3 has type ‘int’ [-Werror=format=]
bogus.c:4:14: error: unused parameter ‘argc’ [-Werror=unused-parameter]
 int main(int argc, char *argv[])
              ^
cc1: all warnings being treated as errors
$

The 'unused parameter' error reminds you that you should be checking that there is an argument to pass to fopen() before you try to open the file.

Fixed code:

#include <string.h>
#include <errno.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    FILE *fp;

    if (argc != 2)
    {
        fprintf(stderr, "Usage: %s file\n", argv[0]);
        return 1;
    }

    fp = fopen(argv[1], "r");
    if (fp == NULL)
    {
        fprintf(stderr, "%s: file %s could not be opened for reading: %s\n",
                argv[0], argv[1], strerror(errno));
        return errno;
    }

    printf("file %s exists\n", argv[1]);

    fclose(fp);

    return 0;
}

Build:

$ gcc -std=c11 -O3 -g -Wall -Wextra -Werror -Wmissing-prototypes \
>     -Wstrict-prototypes -Wold-style-definition bogus.c -o bogus  
$

Run:

$ ./bogus bogus
file bogus exists
$ ./bogus bogus2
./bogus: file bogus2 could not be opened for reading: No such file or directory
$ ./bogus
Usage: ./bogus file
$

Note that the error messages include the program name and report to standard error. When the file is known, the error message includes the file name; it is much easier to debug that error if the program is in a shell script than if the message is just:

No such file or directory

with no indication of which program or which file encountered the problem.

When I remove the #include <string.h> line from the fixed code shown, then I can compile it and run it like this:

$ gcc -o bogus90 bogus.c
bogus.c: In function ‘main’:
bogus.c:18:35: warning: implicit declaration of function ‘strerror’ [-Wimplicit-function-declaration]
                 argv[0], argv[1], strerror(errno));
                                   ^
$ gcc -std=c90 -o bogus90 bogus.c
$ ./bogus90 bogus11
Segmentation fault: 11
$

This was tested with GCC 5.1.0 on Mac OS X 10.10.5 — which is, of course, a 64-bit platform.

Upvotes: 5

Anatoli P

Reputation: 4891

I don't think the functionality of this code would be affected by whether it's a 32-bit or 64-bit architecture: it doesn't matter if pointers are 32- or 64-bit, or if long int is 32- or 64-bit. Inclusion of headers, in this case string.h, should not affect linking to libraries, either. Header inclusion matters to the compiler, not the linker. The compiler might warn about the function being implicitly declared, but as long as the linker can find the function in one of the libraries it searches, it will successfully link the binary, and it should run just fine.

I just built and ran this code successfully on a 64-bit CentOS box, using clang 3.6.2. I did get this compiler warning:

junk.c:10:33: warning: implicitly declaring library function 'strerror' with type 'char *(int)'
        fprintf(stderr, "%s\n", strerror(errno));
                                ^
junk.c:10:33: note: include the header <string.h> or explicitly provide a declaration for 'strerror'
1 warning generated.

The program was given a non-existent file name, and the error message, "No such file or directory," was meaningful. However, this is because the strerror() function is a well-known standard library function, and its declaration was correctly guessed by the compiler. If it is a user-defined function, the compiler may not be so "lucky" at guessing, and then the architecture can matter, as suggested by other answers.

So, the lesson learned: make sure function declarations are available to the compiler and heed the warnings!

Upvotes: 1
