Reputation: 1139

Find program's code address at runtime?

When I debug a C program with gdb, the disassemble command shows the instructions and their addresses in the code segment. Is it possible to get those memory addresses at runtime? I am using Ubuntu. Thank you.

[edit] To be more specific, here is an example.

#include <stdio.h>
#include <stdlib.h>   /* exit */

void myfunction(void);   /* defined elsewhere */

int main(int argc,char *argv[]){
    myfunction();
    exit(0);
}

Now I would like to get the address of myfunction() in the code segment while my program is running.

Upvotes: 7

Answers (4)

Stefano Borini

Reputation: 143755

Regarding a comment on another answer (getting the address of an instruction): you can use this very ugly trick:

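#include <setjmp.h>
#include <stdio.h>

void function() {
    printf("in function\n");
    printf("%d\n",__LINE__);
    printf("exiting function\n");

}

int main() {
    jmp_buf env;
    int i;

    printf("in main\n");
    printf("%d\n",__LINE__);
    printf("calling function\n");
    /* setjmp saves the register context, including the instruction pointer */
    setjmp(env);
    /* Dump the saved slots. Assumes jmp_buf is a plain array of at least
       18 integer slots, as on the machine that produced the output below;
       glibc defines it differently. */
    for (i=0; i < 18; ++i) {
        printf("%p\n",(void *)(long)env[i]);
    }
    function();
    printf("in main again\n");
    printf("%d\n",__LINE__);

    return 0;
}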

The saved instruction pointer (eip) should be env[12], but the layout of jmp_buf is machine dependent, so triple-check my word. This is the output:

in main
13
calling function
0xbfff037f
0x0
0x1f80
0x1dcb
0x4
0x8fe2f50c
0x0
0x0
0xbffff2a8
0xbffff240
0x1f
0x292
0x1e09
0x17
0x8fe0001f
0x1f
0x0
0x37
in function
4
exiting function
in main again
37

Have fun!
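If the compiler is GCC or Clang, a less fragile way to get an instruction address than digging through jmp_buf may be the __builtin_return_address builtin. Note that this is a different technique from the setjmp trick above, shown only as a minimal sketch. The helper names current_pc and print_current_pc are hypothetical, and the noinline attribute is assumed to be needed so that the call is not folded into the caller.

```c
#include <stdio.h>

/* Hypothetical helper: returns the address of the instruction that follows
   the call to current_pc() in the caller. Relies on a GCC/Clang builtin. */
__attribute__((noinline))
void *current_pc(void)
{
    return __builtin_return_address(0);
}

/* Hypothetical helper: prints an address inside its own body and returns it
   so the caller can inspect it as well. */
void *print_current_pc(void)
{
    void *pc = current_pc();
    printf("executing near %p\n", pc);
    return pc;
}
```

Calling print_current_pc() from main prints an address that should fall inside the range gdb's disassemble command shows for print_current_pc.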

Upvotes: 3

Adrian Panasiuk

Reputation: 7343

To get a backtrace, use execinfo.h as documented in the GNU libc manual.

For example:

#include <execinfo.h>
#include <stdio.h>
#include <unistd.h>


void trace_pom()
{   
    const int sz = 15;
    void *buf[sz];

    // get at most sz entries
    int n = backtrace(buf, sz);

    // output them right to stderr
    backtrace_symbols_fd(buf, n, fileno(stderr));

    // but if you want to output the strings yourself
    // you may use char ** backtrace_symbols (void *const *buffer, int size)
    write(fileno(stderr), "\n", 1);
}


void TransferFunds(int n);

void DepositMoney(int n)
{   
    if (n <= 0)
        trace_pom();
    else TransferFunds(n-1);
}


void TransferFunds(int n)
{   
    DepositMoney(n);
}


int main()
{   
    DepositMoney(3);

    return 0;
}

Compiled with:

gcc a.c -o a -g -Wall -Werror -rdynamic

According to the GNU libc manual:

Currently, the function name and offset can only be obtained on systems that use the ELF binary format for programs and libraries. On other systems, only the hexadecimal return address will be present. Also, you may need to pass additional flags to the linker to make the function names available to the program. (For example, on systems using GNU ld, you must pass -rdynamic.)

Output:

./a(trace_pom+0xc9)[0x80487fd]
./a(DepositMoney+0x11)[0x8048862]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(TransferFunds+0x11)[0x8048885]
./a(DepositMoney+0x21)[0x8048872]
./a(main+0x1d)[0x80488a4]
/lib/i686/cmov/libc.so.6(__libc_start_main+0xe5)[0xb7e16775]
./a[0x80486a1]
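The code comment above mentions backtrace_symbols as the way to get the strings yourself. A minimal sketch of that variant, assuming glibc's execinfo.h, might look like the following. The helper name print_trace and the 32-frame limit are arbitrary choices for illustration.

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: writes the current call stack to `out`, one frame per
   line, and returns the number of frames captured (0 on failure). */
int print_trace(FILE *out)
{
    void *frames[32];
    int n = backtrace(frames, 32);

    /* backtrace_symbols allocates a single block holding all the strings */
    char **names = backtrace_symbols(frames, n);
    if (names == NULL)
        return 0;

    for (int i = 0; i < n; i++)
        fprintf(out, "#%d %s\n", i, names[i]);

    free(names);   /* free the array only, never the individual strings */
    return n;
}
```

As with backtrace_symbols_fd, function names for the executable itself only show up when it is linked with -rdynamic. Otherwise each line carries just the hexadecimal address.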

Upvotes: 5

ZelluX

Reputation: 72585

If you know the function name at compile time, simply use

void * addr = myfunction;

If the function name is only known at run time, you can look up the symbol address dynamically with the bfd library. Here is a function I once wrote for x86_64; in the example, find_symbol("a.out", "myfunction") returns the address.

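/* Some binutils versions refuse to compile bfd.h unless these are defined. */
#define PACKAGE "find_symbol"
#define PACKAGE_VERSION "1"
#include <bfd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns the value of symname from filename's symbol table, or -1 on
   failure. Link with -lbfd. */
long find_symbol(char *filename, char *symname)
{
    bfd *ibfd;
    asymbol **symtab;
    long nsize, nsyms, i;
    long addr = -1;
    symbol_info syminfo;
    char **matching;

    bfd_init();
    ibfd = bfd_openr(filename, NULL);

    if (ibfd == NULL) {
        printf("bfd_openr error\n");
        return -1;
    }

    if (!bfd_check_format_matches(ibfd, bfd_object, &matching)) {
        printf("format_matches\n");
        bfd_close(ibfd);
        return -1;
    }

    /* upper bound, in bytes, on the storage needed for the symbol pointers */
    nsize = bfd_get_symtab_upper_bound (ibfd);
    symtab = nsize > 0 ? malloc(nsize) : NULL;
    if (symtab == NULL) {
        printf("no symbol table\n");
        bfd_close(ibfd);
        return -1;
    }
    nsyms = bfd_canonicalize_symtab(ibfd, symtab);

    for (i = 0; i < nsyms; i++) {
        if (strcmp(symtab[i]->name, symname) == 0) {
            bfd_symbol_info(symtab[i], &syminfo);
            addr = (long) syminfo.value;
            break;
        }
    }

    if (addr == -1)
        printf("cannot find symbol\n");

    /* release resources on every path, including a successful lookup */
    free(symtab);
    bfd_close(ibfd);
    return addr;
}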

Upvotes: 9

Andy Ross

Reputation: 12033

The answer above is vastly overcomplicated. If the function reference is static, as it is in the question, the address is simply the value of the symbol name in pointer context:

void* myfunction_address = myfunction;

If you are grabbing the function dynamically out of a shared library, then the value returned from dlsym() (POSIX) or GetProcAddress() (Windows) is likewise the address of the function.
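As a rough sketch of the dlsym() route on Linux, the helper below looks a name up among the symbols visible to the running process. The name lookup_symbol is hypothetical, and the code assumes a dynamically linked program. On older glibc versions you may also need to link with -ldl.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: looks up `name` among the symbols visible to the
   running program and returns its address, or NULL if it is not found. */
void *lookup_symbol(const char *name)
{
    /* dlopen(NULL, ...) yields a handle for the main program together with
       the shared libraries that were loaded at startup */
    void *handle = dlopen(NULL, RTLD_NOW);
    if (handle == NULL)
        return NULL;

    void *addr = dlsym(handle, name);
    dlclose(handle);
    return addr;
}
```

lookup_symbol("puts") should return the address of puts in libc. Functions defined in the executable itself are normally only found this way if the program was linked with -rdynamic, so that they are exported in its dynamic symbol table.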

Note that the above code is likely to generate a warning with some compilers, as ISO C technically forbids assignment between code and data pointers (some architectures put them in physically distinct address spaces).

And some pedants will point out that the address returned isn't really guaranteed to be the memory address of the function, it's just a unique value that can be compared for equality with other function pointers and acts, when called, to transfer control to the function whose pointer it holds. Obviously all known compilers implement this with a branch target address.
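To make the last two points concrete, here is a small sketch. It keeps the address in a real function pointer, which needs no cast, and only converts to void * for printing, which is the conversion that strict ISO C does not define. The body given to myfunction and the helper check_function_pointer are hypothetical and exist only so the behaviour can be checked.

```c
#include <stdio.h>

/* Hypothetical stand-in for the question's myfunction(); it returns a value
   so that a call through the pointer can be verified. */
int myfunction(void)
{
    return 42;
}

/* Returns 1 if the stored pointer compares equal to the function name and
   calling through it behaves like a direct call. */
int check_function_pointer(void)
{
    int (*fp)(void) = myfunction;   /* same pointer type, so no cast */

    /* Converting to void * for %p is a common extension, not strict ISO C */
    printf("myfunction is at %p\n", (void *)fp);

    return fp == myfunction && fp() == myfunction();
}
```

The two properties ISO C does guarantee, equality comparison and calling through the pointer, are the ones check_function_pointer tests. The printed value is whatever the implementation uses to represent the pointer.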

And finally, note that the "address" of a function is a little ambiguous. If the function was loaded dynamically or is an extern reference to an exported symbol, what you really get is generally a pointer to some fixup code in the "PLT" (a Unix/ELF term, though the PE/COFF mechanism on Windows is similar) that then jumps to the function.

Upvotes: 16
