user2371524

How to make a linux shared object (library) runnable on its own?

Noticing that gcc -shared creates an executable file, I just got the weird idea to check what happens when I try to run it ... well the result was a segfault for my own lib. So, being curious about that, I tried to "run" the glibc (/lib/x86_64-linux-gnu/libc.so.6 on my system). Sure enough, it didn't crash but provided me some output:

GNU C Library (Debian GLIBC 2.19-18) stable release version 2.19, by Roland McGrath et al.
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 4.8.4.
Compiled on a Linux 3.16.7 system on 2015-04-14.
Available extensions:
    crypt add-on version 2.1 by Michael Glad and others
    GNU Libidn by Simon Josefsson
    Native POSIX Threads Library by Ulrich Drepper et al
    BIND-8.2.3-T5B
libc ABIs: UNIQUE IFUNC
For bug reporting instructions, please see:
<http://www.debian.org/Bugs/>.

So my question here is: what is the magic behind this? I can't just define a main symbol in a library -- or can I?

Upvotes: 21

Views: 4676

Answers (2)

xjossy

Reputation: 111

When linking with -shared, gcc leaves out the startup files, so some objects (such as std::cout) are never initialized. As a result, std::cout << "Abc" << std::endl will segfault.

Approach 1

(the simplest way to create an executable library)

To fix this, change the linker options. The simplest way is to run gcc with the -v (verbose) option while building an executable and look at the linker command line. In that command line, remove -z now and -pie (if present) and add -shared. The sources must still be compiled with -fPIC (not -fPIE).

Let's try. For example we have the following x.cpp:

#include <iostream>

// The next line is required; when building an executable, gcc includes
// the full path to ld-linux-x86-64.so.2 anyway:
extern "C" const char interp_section[] __attribute__((section(".interp"))) = "/lib64/ld-linux-x86-64.so.2";

// some "library" function
extern "C"  __attribute__((visibility("default"))) int aaa() {
    std::cout << "AAA" << std::endl;
    return 1234;
}

// use main in a common way
int main() {
    std::cout << "Abc" << std::endl;
}

First compile this file with g++ -c x.cpp -fPIC. Then link it, dumping the linker command line, with g++ x.o -o x -v.

This produces a working executable that cannot be dynamically loaded as a shared library. Check this with the Python script check_x.py:

import ctypes
d = ctypes.cdll.LoadLibrary('./x')
print(d.aaa())

Running $ ./x will be successful. Running $ python check_x.py will fail with OSError: ./x: cannot dynamically load position-independent executable.
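
The same check can be wrapped so that a failed load is reported instead of ending in a traceback. This is only a sketch: try_call is a hypothetical helper name, and it assumes the exported function takes no arguments and returns an int, as aaa does.

```python
import ctypes

def try_call(path, symbol="aaa"):
    """Load `path` as a shared library and call `symbol`; return None if loading fails."""
    try:
        lib = ctypes.CDLL(path)
    except OSError as err:
        # The dynamic loader refused the file (missing, not loadable, or a PIE executable)
        print(f"cannot load {path}: {err}")
        return None
    # ctypes assumes an int return type by default, which matches aaa()
    return getattr(lib, symbol)()

if __name__ == "__main__":
    print(try_call("./x"))
```

With the executable built above, this should print the loader's error message followed by None.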

When linking, g++ calls the collect2 linker wrapper, which in turn calls ld. The collect2 command line appears in the output of the last g++ command and looks like this:

/usr/lib/gcc/x86_64-linux-gnu/11/collect2 -plugin /usr/lib/gcc/x86_64-linux-gnu/11/liblto_plugin.so -plugin-opt=/usr/lib/gcc/x86_64-linux-gnu/11/lto-wrapper -plugin-opt=-fresolution=/tmp/ccqDN9Df.res -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc --build-id --eh-frame-hdr -m elf_x86_64 --hash-style=gnu --as-needed -dynamic-linker /lib64/ld-linux-x86-64.so.2 -pie -z now -z relro -o x /usr/lib/gcc/x86_64-linux-gnu/11/../../../x86_64-linux-gnu/Scrt1.o /usr/lib/gcc/x86_64-linux-gnu/11/../../../x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/11/crtbeginS.o -L/usr/lib/gcc/x86_64-linux-gnu/11 -L/usr/lib/gcc/x86_64-linux-gnu/11/../../../x86_64-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/11/../../../../lib -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib -L/usr/lib/gcc/x86_64-linux-gnu/11/../../.. x.o -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /usr/lib/gcc/x86_64-linux-gnu/11/crtendS.o /usr/lib/gcc/x86_64-linux-gnu/11/../../../x86_64-linux-gnu/crtn.o

Find -pie -z now in it and replace that with -shared. Running the modified command produces a new x, which works both as an executable and as a shared library:

$ ./x
Abc
$ python3 check_x.py
AAA
1234

This approach has disadvantages: the replacement is hard to automate. Also, before calling collect2, GCC creates a temporary file for the LTO (link-time optimization) plugin, and this file will be missing when you run the command manually.

Approach 2

(a more practical way to create an executable library)

The idea is to replace GCC's linker with our own wrapper, which corrects the arguments passed to collect2. We will use the following Python script, collect3.py, as the linker:

#!/usr/bin/python3
import subprocess, sys, os

marker = '--_wrapper_make_runnable_so'

def sublist_index(haystack, needle):
    # +1 so that a match at the very end of haystack is also found
    for i in range(len(haystack) - len(needle) + 1):
        if haystack[i:i+len(needle)] == needle: return i

def remove_sublist(haystack, needle):
    idx = sublist_index(haystack, needle)
    if idx is None: return haystack

    return haystack[:idx] + haystack[idx+len(needle):]

def fix_args(args):
    #print("!!BEFORE REPLACE ", *args)
    if marker not in args:
         return args

    args = remove_sublist(args, [marker])
    args = remove_sublist(args, ['-z', 'now'])
    args = remove_sublist(args, ['-pie'])

    args.append('-shared')
    #print("!!AFTER REPLACE ", *args)

    return args

# get search paths for linker directly from gcc
def findPaths(prefix = "programs: ="):
    for line in subprocess.run(['gcc', '-print-search-dirs'], stdout=subprocess.PIPE).stdout.decode('utf-8').split('\n'):
        if line.startswith(prefix): return line[len(prefix):].split(':')

# find the real linker (collect2) in gcc's program search paths
def findLinker(linker_name = 'collect2'):
    for p in findPaths():
        candidate = os.path.join(p, linker_name)
        #print("!!CHECKING LINKER ", candidate)
        if os.path.exists(candidate) : return candidate

if __name__=='__main__':
    args = sys.argv[1:]
    args = fix_args(args)
    exit(subprocess.call([findLinker(), *args]))
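
The argument rewrite can be tried in isolation on a shortened argument list. The sketch below is self-contained and separate from collect3.py; the helper names and the sample list are illustrative, and the list is not a complete collect2 command line.

```python
def remove_sublist(items, needle):
    """Return `items` without the first contiguous occurrence of `needle`."""
    n = len(needle)
    for i in range(len(items) - n + 1):
        if items[i:i + n] == needle:
            return items[:i] + items[i + n:]
    return items

def to_shared(args):
    """Drop '-z now' and '-pie', then append '-shared'."""
    for needle in (["-z", "now"], ["-pie"]):
        args = remove_sublist(args, needle)
    return args + ["-shared"]

# Shortened, illustrative argument list
sample = ["-dynamic-linker", "/lib64/ld-linux-x86-64.so.2",
          "-pie", "-z", "now", "-z", "relro", "-o", "x", "x.o"]
print(to_shared(sample))
```

Note that only the -z now pair is removed; the neighbouring -z relro pair stays in place.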

This script rewrites the arguments and calls the real linker. To switch linkers, create a file specs.txt with the following content:

*linker:
<full path to>/collect3.py

To tell our fake linker that we want the arguments corrected, we pass the additional argument --_wrapper_make_runnable_so. The complete command line is:

g++ -specs=specs.txt -Wl,--_wrapper_make_runnable_so x.o -o x

(this assumes that you want to link an existing x.o).

After this you can both run the target x and use it as a dynamic library.

Upvotes: 0

jacwah

Reputation: 2847

I wrote a blog post on this subject where I go more in depth because I found it intriguing. You can find my original answer below.


You can specify a custom entry point to the linker with the -Wl,-e,entry_point option to gcc, where entry_point is the name of the library's "main" function.

#include <stdio.h>

void entry_point()
{
    printf("Hello, world!\n");
}

The linker doesn't expect something linked with -shared to be run as an executable, and must be given some more information for the program to be runnable. If you try to run the library now, you will encounter a segmentation fault.

The .interp section is a part of the resulting binary that is needed by the OS to run the application. It's set automatically by the linker if -shared is not used. You must set this section manually in the C code if building a shared library that you want to execute by itself. See this question.
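
Whether a given binary carries an interpreter path can be checked by looking for a PT_INTERP program header, which points at the string stored in .interp. The following is a minimal sketch, assuming a 64-bit little-endian ELF file; read_interp is a hypothetical helper, not part of any library.

```python
import struct

PT_INTERP = 3  # program header type that refers to the interpreter path

def read_interp(data):
    """Return the interpreter path from a 64-bit little-endian ELF image, or None."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a 64-bit little-endian ELF file")
    # In Elf64_Ehdr, e_phoff is at offset 32; e_phentsize and e_phnum at 54 and 56
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # In Elf64_Phdr, p_type is at +0, p_offset at +8, p_filesz at +32
        (p_type,) = struct.unpack_from("<I", data, base)
        (p_offset,) = struct.unpack_from("<Q", data, base + 8)
        (p_filesz,) = struct.unpack_from("<Q", data, base + 32)
        if p_type == PT_INTERP:
            return data[p_offset:p_offset + p_filesz].rstrip(b"\0").decode()
    return None
```

Calling read_interp on the bytes of a file returns the path if one is present, and None for a library that has no interpreter set.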

The interpreter's job is to find and load the shared libraries needed by a program, prepare the program to run, and then run it. For the ELF format (ubiquitous on modern *nix) on Linux, the ld-linux.so program is used. See its man page for more info.

The line below puts a string in the .interp section using GCC attributes. Put this in the global scope of your library to explicitly tell the linker that you want to include a dynamic linker path in your binary.

const char interp_section[] __attribute__((section(".interp"))) = "/path/to/ld-linux";

The easiest way to find the path to ld-linux.so is to run ldd on any normal application. Sample output from my system:

jacwah@jacob-mint17 ~ $ ldd $(which gcc)
    linux-vdso.so.1 =>  (0x00007fff259fe000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007faec5939000)
    /lib64/ld-linux-x86-64.so.2 (0x00007faec5d23000)
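
Picking the loader out of that output can be scripted. This is a rough sketch that assumes the loader's file name contains ld-linux, which is not the case on every architecture; interp_from_ldd is a hypothetical helper name.

```python
import re

def interp_from_ldd(ldd_output):
    """Return the first absolute path in `ldd` output that names an ld-linux loader."""
    for line in ldd_output.splitlines():
        # The loader line starts with an absolute path rather than "name => path"
        match = re.match(r"\s*(/\S*ld-linux\S*)\s+\(0x[0-9a-f]+\)", line)
        if match:
            return match.group(1)
    return None

sample = """\
    linux-vdso.so.1 =>  (0x00007fff259fe000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007faec5939000)
    /lib64/ld-linux-x86-64.so.2 (0x00007faec5d23000)
"""
print(interp_from_ldd(sample))  # /lib64/ld-linux-x86-64.so.2
```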

Once you've specified the interpreter your library should be executable! There's just one slight flaw: it will segfault when entry_point returns.

When you compile a program with main, it's not the first function to be called when executing it. main is actually called by another function called _start. This function is responsible for setting up argv and argc and other initialisation. It then calls main. When main returns, _start calls exit with the return value of main.

There's no return address on the stack in _start, as it's the first function to be called. If it tries to return, an invalid read occurs (ultimately causing a segmentation fault). This is exactly what is happening in our entry point function. Add a call to exit as the last line of your entry function to properly clean up and not crash.

example.c

#include <stdio.h>
#include <stdlib.h>

const char interp_section[] __attribute__((section(".interp"))) = "/path/to/ld-linux";

void entry_point()
{
    printf("Hello, world!\n");
    exit(0);
}

Compile with gcc example.c -shared -fPIC -Wl,-e,entry_point.

Upvotes: 27
