firebush

Reputation: 5880

How to find the functions used by a particular library in a C++ file

I'm working with legacy C++ code compiled with g++. The files in question are compiled using a library. My goal is to determine every use of a function or macro from a particular library in each of these files. (In my case, OpenSSL is the library in question, and I'll reference it as such throughout the rest of the post. However, I think my question generically applies to any C library I'd compile against.)

I could conceive of this being easier if OpenSSL were a C++ library using a namespace: I could simply grep for the namespace to find the OpenSSL functions. Since it is a C library, however, undecorated OpenSSL functions and macros are sprinkled across some of the source files, and I can't readily tell by scanning the source which functions are from OpenSSL and which are local functions or functions from other libraries.

Looking through Stack Overflow, I see questions like this for the Windows environment, but I don't see any answers for a Linux environment. Broadening my search, I see references to nm and objdump, but if it's possible to get the details I'm looking for from these tools from an object file, I can't figure out the correct parameters to use.

Thanks in advance for your help!

Upvotes: 3

Views: 2797

Answers (3)

firebush

Reputation: 5880

A coworker of mine was able to get this information using nm. Here's the procedure we followed:

Get the List of Symbols

As riodoro1 suggested, the list of symbols from the library used by your code can be obtained by linking without the library (without -lcrypto in my case, for instance). Alternatively, it can be obtained using nm, as described below:

  1. Run nm on all relevant objects:

    find . -name '*.o' -exec nm {} \; > nm.txt
    
  2. Keep only the undefined symbols referenced by the objects (the lines marked U) and drop all other symbols:

    grep '^ *U' nm.txt > nm2.txt
    
  3. Remove C++ symbols (mangled names begin with _Z), uniquify those remaining:

    grep -v ' _Z' nm2.txt | sort | uniq > nm3.txt
    
  4. Manually edit nm3.txt to remove the symbols that are not part of OpenSSL, and save the result as nm4.txt.
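
The manual filtering in step 4 could also be approximated by intersecting nm3.txt with the symbols the library itself defines. The following is only a sketch: the function name lib_symbols_used and the file names in the usage note are made up for illustration, and it assumes both inputs are nm output with the symbol name in the last column.

```shell
# Hypothetical helper (not part of the original procedure).
# $1: file of undefined symbols in nm format, e.g. nm3.txt
# $2: file holding nm output for the symbols the library defines
# Prints the symbol names that occur in both files.
lib_symbols_used() {
    lib_names=$(mktemp) || return 1
    # The symbol name is the last field; drop any "@VERSION" suffix
    # that nm may append to versioned dynamic symbols.
    awk 'NF { sub(/@.*/, "", $NF); print $NF }' "$2" | sort -u > "$lib_names"
    # comm -12 prints only the lines common to both sorted inputs.
    awk 'NF { print $NF }' "$1" | sort -u | comm -12 - "$lib_names"
    rm -f "$lib_names"
}
```

Assuming the shared library lives at a path such as /path/to/libcrypto.so, the library list could be produced with nm -D --defined-only /path/to/libcrypto.so > lib.txt, after which lib_symbols_used nm3.txt lib.txt > nm4.txt writes the filtered list. Macros never appear in nm output, so this only covers functions and data symbols.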

Use the Preprocessor to Expand Macros

  1. Build the cc files normally, capture output to log file. Isolate the lines that show the commands that compiled lotus source files. Search and replace in the output to produce commands to invoke the preprocessor. Change:

    • -o .../file.o => -o .../file.i
    • ' -c ' => ' -E '
  2. Run the modified commands to produce preprocessor output.

  3. The preprocessor output contains the full text from all included header files, followed by the preprocessed C code. Headers are long and uninteresting so strip them from the output. We'll get just C code with expanded macros.

    find . -name '*.i' -exec sh -c 'perl cat-preproc-without-headers.pl < "$1" > "${1}cc"' sh {} \;
    

Here are the contents of cat-preproc-without-headers.pl:

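#!/usr/bin/perl
use strict;
use warnings;

# Copy a line to stdout only while $cat is non-zero.
my $cat = 0;

while (<>) {
    if (/^# [1-9]\d* .*\.cc/) {
        # Linemarker naming a .cc file: the lines that follow come from the source file.
        $cat = 1;
    } elsif (/^# [0-9]/) {
        # Any other linemarker: the lines that follow come from an included header.
        $cat = 0;
    } elsif ($cat) {
        print;
    }
}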
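
The search and replace described in step 1 could be scripted with sed rather than done by hand. This is a rough sketch: the function name to_preprocess_cmd is invented here, and it assumes each logged command sits on one line, uses ' -c ' surrounded by spaces, and names its output as '-o path/file.o' with no spaces in the path.

```shell
# Hypothetical helper: reads logged compile commands on stdin and writes
# the matching preprocess-only commands to stdout.
to_preprocess_cmd() {
    # First expression:  ' -c '          => ' -E '
    # Second expression: '-o .../file.o' => '-o .../file.i'
    sed -E -e 's/ -c / -E /' \
           -e 's/(-o +[^ ]+)\.o( |$)/\1.i\2/'
}
```

For example, if the isolated compile lines were saved in a file called compile-commands.log (a name assumed here), then to_preprocess_cmd < compile-commands.log > preprocess-commands.sh would produce a script to run for step 2.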

Conclusion

With the list of symbols and the expanded macros, you now have all the symbols from the library and the places where they are used in the source code.
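
To tie the two halves together, one option is to grep the stripped preprocessor output for every symbol in the list. The sketch below is an assumption about how that might look, not something from the procedure above; find_symbol_uses is a made-up name, and it expects the symbol list in nm format and the stripped files to end in .icc.

```shell
# Hypothetical helper.
# $1: symbol list in nm format (e.g. nm4.txt)
# $2: directory tree holding the stripped preprocessor output (*.icc)
# Prints file:line:text for every line that uses one of the symbols.
find_symbol_uses() {
    names=$(mktemp) || return 1
    # Keep only the symbol name, which is the last field of each nm line.
    awk 'NF { print $NF }' "$1" | sort -u > "$names"
    # -F: fixed strings, -w: whole identifiers only, -n: line numbers,
    # -H: always print the file name
    find "$2" -name '*.icc' -exec grep -H -n -w -F -f "$names" {} +
    rm -f "$names"
}
```

A call such as find_symbol_uses nm4.txt . would then list each use with its file and line number. The line numbers refer to the stripped .icc files, not to the original source files.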

Upvotes: 2

AdrianRK

Reputation: 129

I don't think there is a simple and quick solution for this; you will have to do some work. There are three ways your software might link with OpenSSL:

  1. Static linking.
  2. Dynamic linking with the runtime linker.
  3. Manual linking with dlopen.

In all cases, the best solution would be to remove the header files and the OpenSSL library from their location and recompile the code. If you do not have access to the code, you have to use nm or objdump to get the symbols from your executable and cross-reference them with the ones in the OpenSSL library. This will not work if you are using dlopen to link the library. Another option would be to get the OpenSSL library, recompile it with tracing enabled, and execute your code with the new library.

The nm tool lists all the symbols in an object, regardless of whether it is a library or an executable. You can write a bash script that cross-references the output of nm on the OpenSSL library with the output of nm on your executable. The way to call it is nm objname. The third column is the one with the symbols.

objdump is a more precise tool that you can use to list all the symbols that are undefined in your executable. You can use it to print the dynamic section of your executable (objdump -p objname), whose NEEDED entries list the shared libraries the executable needs at run time. If OpenSSL is listed there, you are linking against it dynamically with the runtime linker. You can use objdump -T on the OpenSSL shared library to get the symbols in the OpenSSL interface, and cross-reference them with the symbols listed when calling objdump -R on your executable.
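
As a small illustration of the dynamic-linking check, the NEEDED entries that objdump -p prints in the dynamic section can be filtered out with awk. This is a sketch only; needed_libs is an invented name, and it assumes the usual GNU objdump layout in which each entry is a line of the form "NEEDED libname".

```shell
# Hypothetical helper: reads "objdump -p" output on stdin and prints the
# shared libraries named in the NEEDED entries of the dynamic section.
needed_libs() {
    awk '$1 == "NEEDED" { print $2 }'
}
```

With an executable called ./myprog (a placeholder name), objdump -p ./myprog | needed_libs would print one library per line; seeing libssl or libcrypto there indicates dynamic linking against OpenSSL.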

Upvotes: 2

riodoro1

Reputation: 1256

As per @firebush's suggestion, I'm posting my comment as an answer (maybe for posterity).

To see where the library functions are used, you can remove the library from the link step and see which .o files have missing references.
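
The linker errors from such a build can be boiled down to a list of symbol names. The sketch below assumes the GNU ld message format, in which each missing symbol is reported as an "undefined reference to" message with the name between a backtick and an apostrophe; other linkers word this differently. The function name missing_symbols is made up for illustration.

```shell
# Hypothetical helper: reads linker error output on stdin and prints the
# unique symbol names taken from "undefined reference to" messages.
# Assumes the GNU ld message format.
missing_symbols() {
    sed -n "s/.*undefined reference to \`\([^']*\)'.*/\1/p" | sort -u
}
```

Since the linker writes its errors to stderr, the link command (with the library flag removed) would need 2>&1 before the pipe into missing_symbols. The lines of the linker output that precede each message name the object or source file containing the reference.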

Upvotes: 1
