oxagast

Reputation: 417

Why does gcc produce different compiled binaries for programs that use different forms of integer literals?

I was wondering what the difference is between:

int a = 0b00000100;
int a = 0x04;
int a = 4;

when compiled with gcc.

I get a different binary when I compile what should be the same number written in different notations. When I run objdump on the binaries, however, there don't seem to be any differences. Could somebody tell me what's going on?

This is my output:

[email protected]:[~]: cat testbin.c && echo && cat testbin2.c
#include "stdio.h"
int main () {
  int a = 0b00000100;
  int b = 0x05;
  int c = 6;
  printf("%d - %d - %d\n", a, b, c);
  return (0);
}

#include "stdio.h"
int main () {
  int a = 4;
  int b = 5;
  int c = 6;
  printf("%d - %d - %d\n", a, b, c);
  return (0);
}
[email protected]:[~]: gcc testbin.c -o testbin
[email protected]:[~]: gcc testbin2.c -o testbin2
[email protected]:[~]: md5sum testbin testbin2
fd6aaa31bdf685ea9444e1edc209565e  testbin
3a3fc241bfc2917ee29999b5befecd2a  testbin2
[email protected]:[~]: objdump -d testbin > testbin.obj && objdump -d testbin2 > testbin2.obj
[email protected]:[~]: diff testbin.obj testbin2.obj
2c2
< testbin:     file format elf64-x86-64
---
> testbin2:     file format elf64-x86-64
[email protected]:[~]: gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 6.3.0-18' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-6 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 6.3.0 20170516 (Debian 6.3.0-18)
[email protected]:[~]:

Notice that the executables are different (they have different hashes), but objdump -d doesn't show any difference apart from the file name.

Upvotes: 1

Views: 848

Answers (1)

templatetypedef

Reputation: 372814

I think that the issue has nothing to do with the integer formats and everything to do with the filenames.

I compiled the following program twice using gcc with no other flags set: first with the filename FIRST_PROG.c and executable name COMPILED_1, then with the filename SECOND_PROG.c and executable name COMPILED_2:

int main() {
    return 0;
}

If you run hd on the first generated executable, at a certain offset you see this:

00001720  66 72 61 6d 65 5f 64 75  6d 6d 79 5f 69 6e 69 74  |frame_dummy_init|
00001730  5f 61 72 72 61 79 5f 65  6e 74 72 79 00 46 49 52  |_array_entry.FIR|
00001740  53 54 5f 50 52 4f 47 2e  63 00 5f 5f 46 52 41 4d  |ST_PROG.c.__FRAM|

Notice that the name of the source file, FIRST_PROG.c, is embedded into the generated executable. Looking at the same location in the second file shows this:

00001720  66 72 61 6d 65 5f 64 75  6d 6d 79 5f 69 6e 69 74  |frame_dummy_init|
00001730  5f 61 72 72 61 79 5f 65  6e 74 72 79 00 53 45 43  |_array_entry.SEC|
00001740  4f 4e 44 5f 50 52 4f 47  2e 63 00 5f 5f 46 52 41  |OND_PROG.c.__FRA|

You can see SECOND_PROG.c is embedded into the binary as well.

Dumping both executables with objdump -s doesn't show this anywhere, which matches the clean diff you had from your programs. However, using readelf -a to list the contents of the executable that's generated does show this:

Symbol table '.symtab' contains 66 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 0000000000400238     0 SECTION LOCAL  DEFAULT    1 
     2: 0000000000400254     0 SECTION LOCAL  DEFAULT    2 
     3: 0000000000400274     0 SECTION LOCAL  DEFAULT    3 
     4: 0000000000400298     0 SECTION LOCAL  DEFAULT    4 
     5: 00000000004002b8     0 SECTION LOCAL  DEFAULT    5 
     6: 0000000000400300     0 SECTION LOCAL  DEFAULT    6 
     7: 0000000000400338     0 SECTION LOCAL  DEFAULT    7 
     8: 0000000000400340     0 SECTION LOCAL  DEFAULT    8 
     9: 0000000000400360     0 SECTION LOCAL  DEFAULT    9 
    10: 0000000000400378     0 SECTION LOCAL  DEFAULT   10 
    11: 0000000000400390     0 SECTION LOCAL  DEFAULT   11 
    12: 00000000004003b0     0 SECTION LOCAL  DEFAULT   12 
    13: 00000000004003d0     0 SECTION LOCAL  DEFAULT   13 
    14: 00000000004003e0     0 SECTION LOCAL  DEFAULT   14 
    15: 0000000000400564     0 SECTION LOCAL  DEFAULT   15 
    16: 0000000000400570     0 SECTION LOCAL  DEFAULT   16 
    17: 0000000000400574     0 SECTION LOCAL  DEFAULT   17 
    18: 00000000004005a8     0 SECTION LOCAL  DEFAULT   18 
    19: 0000000000600e10     0 SECTION LOCAL  DEFAULT   19 
    20: 0000000000600e18     0 SECTION LOCAL  DEFAULT   20 
    21: 0000000000600e20     0 SECTION LOCAL  DEFAULT   21 
    22: 0000000000600e28     0 SECTION LOCAL  DEFAULT   22 
    23: 0000000000600ff8     0 SECTION LOCAL  DEFAULT   23 
    24: 0000000000601000     0 SECTION LOCAL  DEFAULT   24 
    25: 0000000000601020     0 SECTION LOCAL  DEFAULT   25 
    26: 0000000000601030     0 SECTION LOCAL  DEFAULT   26 
    27: 0000000000000000     0 SECTION LOCAL  DEFAULT   27 
    28: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS crtstuff.c
    29: 0000000000600e20     0 OBJECT  LOCAL  DEFAULT   21 __JCR_LIST__
    30: 0000000000400410     0 FUNC    LOCAL  DEFAULT   14 deregister_tm_clones
    31: 0000000000400450     0 FUNC    LOCAL  DEFAULT   14 register_tm_clones
    32: 0000000000400490     0 FUNC    LOCAL  DEFAULT   14 __do_global_dtors_aux
    33: 0000000000601030     1 OBJECT  LOCAL  DEFAULT   26 completed.7585
    34: 0000000000600e18     0 OBJECT  LOCAL  DEFAULT   20 __do_global_dtors_aux_fin
    35: 00000000004004b0     0 FUNC    LOCAL  DEFAULT   14 frame_dummy
    36: 0000000000600e10     0 OBJECT  LOCAL  DEFAULT   19 __frame_dummy_init_array_
    37: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS FIRST_PROG.c
    38: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS crtstuff.c
    39: 0000000000400698     0 OBJECT  LOCAL  DEFAULT   18 __FRAME_END__
    40: 0000000000600e20     0 OBJECT  LOCAL  DEFAULT   21 __JCR_END__
    41: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS 
    42: 0000000000600e18     0 NOTYPE  LOCAL  DEFAULT   19 __init_array_end
    43: 0000000000600e28     0 OBJECT  LOCAL  DEFAULT   22 _DYNAMIC
    44: 0000000000600e10     0 NOTYPE  LOCAL  DEFAULT   19 __init_array_start
    45: 0000000000400574     0 NOTYPE  LOCAL  DEFAULT   17 __GNU_EH_FRAME_HDR
    46: 0000000000601000     0 OBJECT  LOCAL  DEFAULT   24 _GLOBAL_OFFSET_TABLE_
    47: 0000000000400560     2 FUNC    GLOBAL DEFAULT   14 __libc_csu_fini
    48: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_deregisterTMCloneTab
    49: 0000000000601020     0 NOTYPE  WEAK   DEFAULT   25 data_start
    50: 0000000000601030     0 NOTYPE  GLOBAL DEFAULT   25 _edata
    51: 0000000000400564     0 FUNC    GLOBAL DEFAULT   15 _fini
    52: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@@GLIBC_
    53: 0000000000601020     0 NOTYPE  GLOBAL DEFAULT   25 __data_start
    54: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
    55: 0000000000601028     0 OBJECT  GLOBAL HIDDEN    25 __dso_handle
    56: 0000000000400570     4 OBJECT  GLOBAL DEFAULT   16 _IO_stdin_used
    57: 00000000004004f0   101 FUNC    GLOBAL DEFAULT   14 __libc_csu_init
    58: 0000000000601038     0 NOTYPE  GLOBAL DEFAULT   26 _end
    59: 00000000004003e0    42 FUNC    GLOBAL DEFAULT   14 _start
    60: 0000000000601030     0 NOTYPE  GLOBAL DEFAULT   26 __bss_start
    61: 00000000004004d6    11 FUNC    GLOBAL DEFAULT   14 main
    62: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _Jv_RegisterClasses
    63: 0000000000601030     0 OBJECT  GLOBAL HIDDEN    25 __TMC_END__
    64: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_registerTMCloneTable
    65: 0000000000400390     0 FUNC    GLOBAL DEFAULT   11 _init

Notice that entry 37 contains the name of the source file. If you try diffing the output of readelf -a, you do get some pretty helpful information:

81c81
<   [28] .shstrtab         STRTAB           0000000000000000  0000189f
---
>   [28] .shstrtab         STRTAB           0000000000000000  000018a0
86c86
<        0000000000000207  0000000000000000           0     0     1
---
>        0000000000000208  0000000000000000           0     0     1
211c211
<     37: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS FIRST_PROG.c
---
>     37: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS SECOND_PROG.c
258c258
<     Build ID: 2c64961288049002e34a1f14e55d6c80dd96816c
---
>     Build ID: 5425dec81aae53bd30e85fe94659d320bb774dcc

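All of these differences seem to boil down to the source file having a different name. SECOND_PROG.c is one character longer than FIRST_PROG.c, which would account for the one-byte changes in the section offset and size, and the Build ID is a hash of the output's contents, so it changes whenever anything else in the file does.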

So my official answer is "this has nothing whatsoever to do with integer literals and is purely a function of compiling files with different names."

Upvotes: 1
