José

Reputation: 364

About C compilation process and the linking of libraries

Are C libraries linked with the object code, or are they first combined with the source code and only later with the object code? Consider this image from the Cardiff School of Computer Science & Informatics website:

[Image: Compile process]

It seems "strange" that the libraries are linked only after the object code has been generated. After all, we use the source code when we write the includes!

So, how does this actually work? Thanks!

Upvotes: 0

Views: 2561

Answers (4)

thurizas

Reputation: 2528

It might be instructive to look at what each piece of the tool-chain does, following the boxes in your image.

pre-processor

This is really a text editor doing a bunch of substitutions (ok, really really oversimplified). Some of the things that the pre-processor does are:

  • performs simple text-based substitution for #defines. So if we have #define PI 3.1415 in our file and later on we have a line such as angle = angle * PI / 180; the pre-processor will convert this line into angle = angle * 3.1415 / 180;
  • any time we encounter an #include, we can imagine that the pre-processor goes and gets the entire contents of that file and pastes them in at the place where the #include is (and then goes back and performs the substitutions).
  • we can also pass options to the compiler with the #pragma directive.

Finally, we can see the results of running the pre-processor by using the -E option to gcc.

compiler

The output of the pre-processor is still text, and it now contains everything that the compiler needs to be able to process the file. Now the compiler does a lot of things (and I normally break the box up when I describe this process). The compiler will process the text, do a lexical analysis of it, pass it to the parser that verifies that the program satisfies the grammar of the language, output an intermediate representation of the program, perform optimization and produce assembly code.

We can see the assembly code, stopping just before the assembler runs, by using the -S option to gcc.

assembler

The output of the compiler is an assembly listing, which is then passed to an assembler (most commonly gas, the GNU assembler, on Linux) that converts the assembly code into machine code. In addition, one task of the assembler is to build a list of undefined references (i.e. a library function, or a function that you wrote that is implemented in another source file).

We can get the output of the assembler (an object file) by using the -c option to gcc.

linker

The input to the linker will be the output from the assembler (typically called object files, with the extension .o), as well as various libraries. Conceptually, the linker is responsible for hooking everything together, including fixing up the calls to functions that are found in libraries. Normally, the program that performs the linking on Linux is ld, and we can see the results of linking just by running gcc without any special command line options.

I have simplified the discussion of the linker, but I hope I gave you a flavor of what it does.


The only issue that I have with the image you referenced is that I would have moved the phrase "Object Code" to immediately below the assembler box, and at the same time I would have moved the arrow labeled "Libraries" down. I feel that this would indicate that the object code from the assembler and the libraries are combined by the linker to make an executable.

Upvotes: 3

Aidan Medcalf

Reputation: 135

That diagram is correct.

When you #include a header, it essentially copies that header into your file. A header is a list of types and function declarations and constants, etc., but doesn't contain any actual code (C++ and inline functions notwithstanding).

Let's have an example:

library.h

int foo(int num);

library.c

int foo(int num)
{
    return num * 2;
}

yourcode.c

#include <stdio.h>
#include "library.h"
int main(void)
{
    printf("%d\n", foo(100));
    return 0;
}

When you #include library.h, you get the declaration of foo(). The compiler, at this point, knows nothing else about foo() or what it does. The compiler can freely insert calls to foo() despite this. The linker, seeing a call to foo() in the object code for yourcode.c, and seeing the compiled code from library.c, knows that any calls to foo() should go to that code.

In other words, the compiler tells the linker what function to call, and the linker (given all the object code) knows where that function actually is.

(Thanks to Fiddling Bits for corrections.)

Upvotes: 3

Wojtek Surowka

Reputation: 21013

Includes from libraries normally contain only the library's interface. In the simplest case, the .h file provided with the library contains the function declarations, and the compiled functions are in the library file. So you compile your sources against the function declarations from the library headers, and then the linker adds the compiled library functions to your executable.

Upvotes: 3
