Sean

Reputation: 879

How does a C compiler convert a constant to binary

For the sake of specifics, let's consider the latest version of the GCC compiler.

Consider the statement int i = 7;

In assembly it will be something like

MOV 7, R1

This will load the value seven into register R1. The exact instruction may not be important here.

As I understand it, the compiler will now convert the MOV instruction to a processor-specific opcode. Then it will allocate a (possibly virtual) register. Then the constant value 7 needs to go into the register.

My question:

How is the 7 actually converted to binary?

Does the compiler actually divide by 2 repeatedly to get the binary representation? (Maybe afterwards it will convert to hex, but let's stay with the binary step.)

Or, considering that the 7 is written as a character in a text file, is there a clever lookup-table-based technique to convert any string (representing a number) to a binary value?

If the current GCC compiler uses a built-in function to convert the string 7 to the binary value 0111, then how did the first compiler convert a text-based string to a binary value?

Thank you.

Upvotes: 1

Views: 749

Answers (2)

Osman

Reputation: 76

"If the current GCC compiler uses a built-in function to convert the string 7 to the binary value 0111, then how did the first compiler convert a text-based string to a binary value?"

This is a chicken-and-egg problem. Put simply, compilers are built up step by step, and at some point a compiler is written in its own language, so that a C compiler is written in C, and so on.

Before answering your question, we should define what we mean by "compilation", or what a compiler does. Put simply, compilation is a pipeline: the compiler takes your high-level code, performs some operations on it, and generates machine-specific assembly code; an assembler for that machine then takes the assembly code and converts it into a binary object file.
At the compiler level, all it does is write the corresponding assembly to a text file.

The assembler is another program, which takes this text file and converts it into "binary" format.
An assembler can also be written in C. Here too we need a mapping, e.g. movl -> (0000110101110...), but this one is binary rather than ASCII, and we need to write this binary into a file as-is.
Converting numbers into binary format is also redundant, because numbers are already in binary form when they are loaded into memory.
How they are converted and placed into memory is a matter for the operating system's loader program, which exceeds my knowledge.

Upvotes: 0

Lundin

Reputation: 213822

How is the 7 actually converted to binary?

First of all, there's a distinction between the binary base 2 number format and what professional programmers call "a binary executable", meaning generated machine code and most often expressed in hex for convenience. Addressing the latter meaning:

Disassemble the binary (for example at https://godbolt.org/) and see for yourself:

int main (void)
{
  int i = 7;
  return i;
}

Does indeed get translated to something like

mov    eax,0x7
ret  

Translated to binary op codes:

B8 07 00 00 00
C3

Where B8 = mov eax, B9 = mov ecx and so on. The 7 is translated into 07 00 00 00 since this form of mov expects a 4-byte immediate and this is a little-endian CPU.

And this is the point where the compiler/linker stops caring. The code was generated according to the CPU's instruction set and the platform's ABI (Application Binary Interface), and how to deal with this machine code from here on is up to the CPU.

As for how this makes it into the hardware in the actual form of base 2 binary... it's already in that form. Everything we see in a PC is a translated convenience for the human users, who have an easier time reading decimal or hex than raw binary.

Upvotes: 2
