Martin

Reputation: 9359

Is there a way to "statically" interpose a shared .so (or .o) library into an executable?

First of all, consider the following case.

Below is a program:

// test.cpp
extern "C" void printf(const char*, ...);

int main() {
        printf("Hello");
}

Below is a library:

// ext.cpp (the external library)
#include <iostream>

extern "C" void printf(const char* p, ...);

void printf(const char* p, ...) {
        std::cout << p << " World!\n";
}

Now I can compile the above program and library in two different ways.

The first way is to compile the program without linking the external library:

$ g++ test.cpp -o test
$ ldd test
        linux-gate.so.1 =>  (0xb76e8000)
        libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb7518000)
        /lib/ld-linux.so.2 (0xb76e9000)

If I run the above program, it will print:

$ ./test 
Hello

The second way is to compile the program with a link to the external library:

$ g++ -shared -fPIC ext.cpp -o libext.so
$ g++ test.cpp -L./ -lext  -o test
$ export LD_LIBRARY_PATH=./
$ ldd test
        linux-gate.so.1 =>  (0xb773e000)
        libext.so => ./libext.so (0xb7738000)
        libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb756b000)
        libstdc++.so.6 => /usr/lib/i386-linux-gnu/libstdc++.so.6 (0xb7481000)
        /lib/ld-linux.so.2 (0xb773f000)
        libm.so.6 => /lib/i386-linux-gnu/libm.so.6 (0xb743e000)
        libgcc_s.so.1 => /lib/i386-linux-gnu/libgcc_s.so.1 (0xb7421000)
$ ./test
Hello World!

As you can see, in the first case the program uses printf from libc.so, while in the second case it uses printf from libext.so.

My question is: from the executable obtained as in the first case and the object code of libext (either as .so or .o), is it possible to obtain an executable like in the second case? In other words, is it possible to replace the link to libc.so with a link to libext.so for all symbols defined in the latter?

**Note that interposition via LD_PRELOAD is not what I want. I want to obtain an executable which is directly linked to the libraries I need. I stress again that I only have access to the first binary and to the external object I want to "statically" interpose.**

Upvotes: 16

Views: 3645

Answers (8)

Maxim Egorushkin

Reputation: 136208

It is possible. Learn about shared library interposition:

When a program that uses dynamic libraries is compiled, a list of undefined symbols is included in the binary, along with a list of libraries the program is linked with. There is no correspondence between the symbols and the libraries; the two lists just tell the loader which libraries to load and which symbols need to be resolved. At runtime, each symbol is resolved using the first library that provides it. This means that if we can get a library containing our wrapper functions to load before other libraries, the undefined symbols in the program will be resolved to our wrappers instead of the real functions.
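To see which objects the loader has actually mapped, and in what order, a process can ask the C library directly. The following is a minimal sketch, assuming a Linux system whose libc provides dl_iterate_phdr (glibc does); the helper names collect_object and loaded_objects are made up for illustration. On glibc the order reported for the initially loaded objects generally follows the order in which they were loaded, which is what decides whose definition of a symbol wins, but treat that as an assumption to verify on your system.

```cpp
#include <link.h>    // dl_iterate_phdr, struct dl_phdr_info
#include <cstddef>
#include <string>
#include <vector>

// Callback invoked once per loaded object; records the object's path.
// (Hypothetical helper name, not part of any library.)
static int collect_object(struct dl_phdr_info* info, std::size_t, void* data) {
    auto* names = static_cast<std::vector<std::string>*>(data);
    // glibc reports the main executable with an empty name.
    const bool has_name = info->dlpi_name && info->dlpi_name[0] != '\0';
    names->emplace_back(has_name ? info->dlpi_name : "(main executable)");
    return 0;  // returning zero tells dl_iterate_phdr to keep going
}

// Returns the loaded objects in the order the C library reports them.
std::vector<std::string> loaded_objects() {
    std::vector<std::string> names;
    dl_iterate_phdr(collect_object, &names);
    return names;
}
```

Printing the result of loaded_objects() from the second build of test in the question should list libext.so ahead of libc.so.6, matching the ldd output shown there.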

Upvotes: 9

Steve Lorimer

Reputation: 28659

Not statically, but you can redirect dynamically loaded symbols in a shared library to your own functions using the elf-hook utility created by Anthony Shoumikhin.

The typical usage is to redirect certain function calls from within a 3rd-party shared library which you can't edit.

Let's say your 3rd party library is located at /tmp/libtest.so, and you want to redirect printf calls made from within the library, but leave calls to printf from other locations unaffected.

Exemplar app:

lib.h

#pragma once

void test();

lib.cpp

#include "lib.h"
#include <cstdio>

void test()
{
    printf("hello from libtest");
}

In this example, the two files above are compiled into a shared library, libtest.so, which is stored in /tmp.

main.cpp

#include <iostream>
#include <dlfcn.h>
#include <elf_hook.h>
#include "lib.h"

int hooked_printf(const char* p, ...)
{
    std::cout << p << " [[ captured! ]]\n";
    return 0;
}

int main()
{
    // load the 3rd party shared library
    const char* fn = "/tmp/libtest.so";
    void* h = dlopen(fn, RTLD_LAZY);

    // redirect printf calls made from within libtest.so
    elf_hook(fn, LIBRARY_ADDRESS_BY_HANDLE(h), "printf", (void*)hooked_printf);

    printf("hello from my app\n"); // printf in my app is unaffected

    test(); // test is the entry point to the 3rd party library

    dlclose(h);
    return 0;
}

Output

hello from my app
hello from libtest [[ captured! ]]

As you can see, it is possible to interpose your own functions without setting LD_PRELOAD, with the added benefit of finer-grained control over which functions are intercepted.

However, the functions are not statically interposed; they are redirected dynamically at run time.

GitHub source for the elf-hook library is here, and a full codeproject article written by Anthony Shoumikhin is here

Upvotes: 0

jambono

Reputation: 408

It is possible to change the binary.

For example, with a hex editor such as ghex you can edit the binary directly: search for each occurrence of libc.so and replace it with libext.so.
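The same edit can be scripted instead of done by hand. The sketch below is only an illustration under stated assumptions: the function name replace_same_length is hypothetical, and it insists that the replacement has exactly the same length as the original string, because inserting or removing bytes would shift every offset stored elsewhere in the ELF file. Note that "libc.so.6" and "libext.so" both happen to be nine characters long.

```cpp
#include <cstddef>
#include <string>

// Replaces every occurrence of `from` with `to` inside a raw byte buffer
// (for example, the contents of a copy of the executable read into memory).
// Returns the number of replacements, or -1 if the two strings differ in
// length, since a length change would corrupt offsets in the file.
long replace_same_length(std::string& bytes,
                         const std::string& from,
                         const std::string& to) {
    if (from.empty() || from.size() != to.size()) return -1;
    long count = 0;
    std::size_t pos = bytes.find(from);
    while (pos != std::string::npos) {
        bytes.replace(pos, from.size(), to);   // overwrite in place
        ++count;
        pos = bytes.find(from, pos + from.size());
    }
    return count;
}
```

Work on a copy of the binary. Whether the patched program still runs depends on libext.so (or its own dependencies) providing every symbol the program previously took from the replaced library.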

Upvotes: 0

Dmitry

Reputation: 3169

It's possible. You need to edit the ELF file and add your library to its dynamic section. You can check the contents of the dynamic section with readelf -d <executable>, and readelf -S <executable> will tell you the offsets of the .dynamic and .dynstr sections. The .dynamic section holds an array of Elf32_Dyn or Elf64_Dyn structures; the entry you add should have d_tag set to DT_NEEDED and d_un.d_val set to the offset of the string "libext.so" within the .dynstr section.

ELF headers are described in /usr/include/elf.h.
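Before patching anything, it helps to read the existing entries programmatically. The following is a rough sketch, not a patching tool: it loads an ELF file, finds the section of type SHT_DYNAMIC through the section headers, and returns the names of its DT_NEEDED entries in file order. The function names list_needed, needed_from_image and read_at are made up for this example, and the code assumes the file has section headers and the same byte order as the host.

```cpp
#include <elf.h>      // ELF structure definitions (Linux)
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Copies a fixed-size structure out of the file image; false if out of range.
template <typename T>
static bool read_at(const std::vector<char>& img, std::size_t off, T& out) {
    if (off > img.size() || img.size() - off < sizeof(T)) return false;
    std::memcpy(&out, img.data() + off, sizeof(T));
    return true;
}

// Collects DT_NEEDED names; Ehdr/Shdr/Dyn are the 32- or 64-bit ELF types.
template <typename Ehdr, typename Shdr, typename Dyn>
static std::vector<std::string> needed_from_image(const std::vector<char>& img) {
    std::vector<std::string> names;
    Ehdr eh;
    if (!read_at(img, 0, eh) || eh.e_shoff == 0 ||
        eh.e_shentsize != sizeof(Shdr)) return names;
    for (std::size_t i = 0; i < eh.e_shnum; ++i) {
        Shdr dyn_sh, str_sh;
        if (!read_at(img, eh.e_shoff + i * sizeof(Shdr), dyn_sh)) break;
        if (dyn_sh.sh_type != SHT_DYNAMIC) continue;
        // sh_link of the dynamic section is the index of its string table.
        if (!read_at(img, eh.e_shoff + dyn_sh.sh_link * sizeof(Shdr), str_sh)) break;
        for (std::size_t off = 0; off + sizeof(Dyn) <= dyn_sh.sh_size; off += sizeof(Dyn)) {
            Dyn d;
            if (!read_at(img, dyn_sh.sh_offset + off, d) || d.d_tag == DT_NULL) break;
            if (d.d_tag != DT_NEEDED) continue;
            // d_un.d_val is an offset into the string table, not a pointer.
            std::size_t start = str_sh.sh_offset + d.d_un.d_val;
            if (start >= img.size()) continue;
            auto first = img.begin() + start;
            names.emplace_back(first, std::find(first, img.end(), '\0'));
        }
    }
    return names;
}

// Returns the DT_NEEDED library names of the ELF file at `path`,
// or an empty list if the file is missing or not a recognised ELF file.
std::vector<std::string> list_needed(const std::string& path) {
    std::ifstream f(path, std::ios::binary);
    std::vector<char> img((std::istreambuf_iterator<char>(f)),
                          std::istreambuf_iterator<char>());
    if (img.size() < EI_NIDENT || std::memcmp(img.data(), ELFMAG, SELFMAG) != 0)
        return {};
    if (img[EI_CLASS] == ELFCLASS32)
        return needed_from_image<Elf32_Ehdr, Elf32_Shdr, Elf32_Dyn>(img);
    if (img[EI_CLASS] == ELFCLASS64)
        return needed_from_image<Elf64_Ehdr, Elf64_Shdr, Elf64_Dyn>(img);
    return {};
}
```

Calling list_needed("./test") on the first build from the question should report the same libraries that readelf -d lists as NEEDED. A real patcher would also have to find room for the new string and the new entry, which this sketch does not attempt.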

Upvotes: 1

TheCodeArtist

Reputation: 22477

What you ask for is traditionally NOT possible. This has already been discussed here and here.

The crux of your question being -

How to statically link a dynamic shared object?

This cannot be done. Statically linking a library is effectively the same as taking that library's compiled objects, unpacking them into your current project, and using them as if they were your own: *.a files are just archives of *.o files with all of their information intact. Dynamic libraries, on the other hand, are already linked; the relocation information that a static link needs has been discarded, so they cannot be statically linked into an executable.

However you DO have other alternatives to work around this technical limitation.


So what are your options?

1. Use LD_PRELOAD on target system

Shared library interposition is well described in Maxim's answer.

2. Prepare a pre-linked stand-alone executable

elf-statifier is a tool for creating portable, self-contained Linux executables.

It attempts to package a dynamically linked executable and all of the dynamically linked libraries it needs into a single stand-alone executable file. This file can be copied to and run on another machine independently.

So on your development machine, you can set LD_PRELOAD, run the original executable, and verify that it works properly. At this point elf-statifier creates a snapshot of the process memory image. This snapshot is saved as an ELF executable with all the required shared libraries (including your custom libext.so) inside. Hence there is no need to make any modifications (e.g. to LD_PRELOAD) on the target system running the newly generated stand-alone executable.

However, this approach is not guaranteed to work in all scenarios. This is due to the fact that recent Linux kernels introduced VDSO and ASLR.

A commercial alternative to this is ermine. It can work around VDSO and ASLR limitations.

Upvotes: 3

dave

Reputation: 4922

Statifier probably does what you want. It takes an executable and all shared libraries and outputs a static executable.

Upvotes: 1

Tom

Reputation: 2389

It might be possible to do what you're asking by dynamically loading the library using dlopen(), accessing the symbol for the function as a function pointer using dlsym(), and then invoking it via the function pointer. There's a good example of what to do on this website.

I tailored that example to your example above:

// test.cpp
// Build with something like: g++ test.cpp -ldl -o test
#include <cstdio>
#include <dlfcn.h>
#include <iostream>

// The printf in libext.so is declared as returning void, so the
// function pointer type used for it matches that declaration.
typedef void (*ext_printf_t)(const char *p, ...);

int main() {

  // Call the standard library printf
  printf("Hello"); // should print "Hello"

  // Now dynamically load the "overloaded" printf and call it instead
  void* handle = dlopen("./libext.so", RTLD_LAZY);
  if (!handle) {
    std::cerr << "Cannot open library: " << dlerror() << std::endl;
    return 1;
  }

  // reset errors
  dlerror();

  // Look the symbol up in libext.so specifically
  ext_printf_t my_printf = reinterpret_cast<ext_printf_t>(dlsym(handle, "printf"));
  const char *dlsym_error = dlerror();
  if (dlsym_error) {
    std::cerr << "Cannot load symbol 'printf': " << dlsym_error << std::endl;
    dlclose(handle);
    return 1;
  }

  my_printf("Hello"); // should print "Hello World!"

  // close the library
  dlclose(handle);
  return 0;
}

The man pages for dlopen and dlsym provide more detail. Note that dlsym(handle, "printf") only looks the symbol up in libext.so (and its dependencies) and returns its address; it does not rebind the printf that the rest of the program already uses, so there is nothing to "undo" later. Whether this approach fits really depends on the context of your program and on what you're trying to do overall.

Upvotes: 0

Gabe

Reputation: 897

You are going to have to modify the binary. Take a look at patchelf http://nixos.org/patchelf.html

It will let you set or modify either the RPATH or even the "interpreter" i.e. ld-linux-x86-64.so to something else.

From the description of the utility:

Dynamically linked ELF executables always specify a dynamic linker or interpreter, which is a program that actually loads the executable along with all its dynamically linked libraries. (The kernel just loads the interpreter, not the executable.) For example, on a Linux/x86 system the ELF interpreter is typically the file /lib/ld-linux.so.2.

So what you could do is run patchelf on the binary in question (i.e. test) to point it at your own interpreter, which then loads your library. This may be difficult, but the source code to ld-linux.so is available.

Option 2 would be to modify the list of libraries yourself. At least patchelf gives you a starting point in that the code iterates over the list of libraries (see DT_NEEDED in the code).

The elf specification documentation does indicate that the order is indeed important:

DT_NEEDED: This element holds the string table offset of a null-terminated string, giving the name of a needed library. The offset is an index into the table recorded in the DT_STRTAB entry. See ‘‘Shared Object Dependencies’’ for more information about these names. The dynamic array may contain multiple entries with this type. These entries’ relative order is significant, though their relation to entries of other types is not.

The nature of your question indicates you are familiar with programming :-) Might be a good time to contribute an addition to patchelf... Modifying library dependencies in a binary.

Or maybe your intention is to do exactly what patchelf was created to do... Anyway, hope this helps!

Upvotes: 2
