Grigory Rechistov

Reputation: 2214

Capture all compiler invocations and command line parameters during build

I want to run static analysis tools on C/C++ (and possibly Python, Java, etc.) code in a large software project built with make. As is well known, make (or any other build tool) invokes the compiler and similar tools on the specified source files. Compilation can also be controlled by defining environment variables that are later passed to the compiler as arguments.

The key to accurate static analysis is to provide defines and include paths exactly as they were passed to the compiler (basically all its -D and -I arguments). This way, the tool will be able to follow the same code paths the compiler has followed.

The problem is that the project is too complex for this environment to be determined statically: different files are built with different sets of defines, include paths, and other compilation flags.

The idea is that it should be possible to capture each individual compiler invocation, with all the arguments passed to it, for each input file. With that information, and after some straightforward filtering (e.g. there is no need to know -O optimization levels or -W warning settings), the static analyzer can be invoked on each input file with exactly the set of defines and includes used for that file.
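To make the filtering step concrete, here is a minimal Python sketch. The function name filter_analysis_flags is hypothetical, and the choice of which options to keep (-D, -U, -I, -isystem, -include) is an assumption about what a given analyzer needs; everything else, such as -O and -W, is dropped.

```python
def filter_analysis_flags(argv):
    """Keep only the preprocessor-related options from a compiler command line.

    Assumption: the analyzer only needs -D, -U, -I, -isystem and -include.
    Both the joined form (-DFOO) and the separate form (-D FOO) are handled.
    """
    joined_prefixes = ("-D", "-U", "-I")
    separate_options = {"-D", "-U", "-I", "-isystem", "-include"}
    kept = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg in separate_options:
            # The option value is the next argument, if there is one.
            if i + 1 < len(argv):
                kept.extend([arg, argv[i + 1]])
            i += 2
        elif arg.startswith(joined_prefixes):
            kept.append(arg)
            i += 1
        else:
            # Optimization, warning, output and input arguments are skipped.
            i += 1
    return kept


captured = ["gcc", "-O2", "-Wall", "-DDEBUG=1", "-I", "include",
            "-Isrc", "-c", "foo.c", "-o", "foo.o"]
flags = filter_analysis_flags(captured)
```

The resulting list can then be appended to the analyzer's own command line for foo.c.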

The question is: are there existing tools/workflows that implement the idea I've described? I am mostly interested in a solution for POSIX systems, but ideas for Windows are also welcome.

A few ideas I've come to on my own.

Upvotes: 2

Views: 987

Answers (2)

Svyatoslav Razmyslov

Reputation: 411

There are several ways to gather compilation parameters on Linux:

  1. Override the CC/CXX environment variables. This is what the scan-build utility from Clang Analyzer does. This method works reliably only for simple Make-based projects.

  2. procfs: information about every process is available under /proc/PID/... . Reading it is slow, so you might miss some of the processes of a build, short-lived ones in particular.

  3. The strace utility (built on the ptrace system call). Its output contains a lot of useful information, but it requires complicated parsing because the output of concurrent processes is interleaved. If the build does not use many threads, it is a fairly reliable way to gather information about the processes. It’s used in PVS-Studio.

  4. JSON Compilation Database in CMake. You can get all the compilation parameters by passing -DCMAKE_EXPORT_COMPILE_COMMANDS=On. It is a reliable method if the project does not depend on non-standard environment variables. Also, a CMake project can contain errors that produce incorrect JSON without affecting the build itself. It’s supported in PVS-Studio.

  5. The Bear utility (function substitution using LD_PRELOAD). You can get a JSON Compilation Database for any project. But without the environment variables it will be impossible to run the analyzer for some projects. Also, you cannot use it with projects that already use LD_PRELOAD for their build. It’s supported in PVS-Studio.
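Once a JSON compilation database exists (from CMake or Bear), reading it is straightforward. The sketch below is only an illustration: the function name load_compile_commands and the sample entries are made up, and it assumes each entry carries "directory", "file", and either "arguments" (a list) or "command" (a single shell-quoted string), as the compilation database format describes.

```python
import json
import os
import shlex

# Made-up sample in the shape of a compile_commands.json file.
SAMPLE = """[
  {"directory": "/build", "file": "src/a.c",
   "command": "cc -DFOO=1 -Iinclude -c src/a.c"},
  {"directory": "/build", "file": "/src/b.c",
   "arguments": ["cc", "-DBAR", "-c", "/src/b.c"]}
]"""


def load_compile_commands(text):
    """Map each source file (made absolute) to its compiler argument list."""
    commands = {}
    for entry in json.loads(text):
        if "arguments" in entry:
            args = list(entry["arguments"])
        else:
            # "command" is one shell-quoted string; split it like a POSIX shell.
            args = shlex.split(entry["command"])
        # Relative file names are resolved against the entry's directory.
        path = os.path.normpath(os.path.join(entry["directory"], entry["file"]))
        commands[path] = args
    return commands


database = load_compile_commands(SAMPLE)
```

For a real project, the text would come from the compile_commands.json file in the build directory instead of SAMPLE.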

Ways PVS-Studio collects compilation information on Windows:

  1. Visual Studio API to get the compilation parameters of standard projects;

  2. MSBuild API to get the compilation parameters of standard projects;

  3. The Win API, to get information on any compilation process, as Windows Task Manager does, for example.

Upvotes: 4

mattmilten

Reputation: 6716

VERBOSE=true makes Makefiles generated by CMake display all commands with all their parameters. It is not an option of make itself; plain make already echoes each command unless the Makefile suppresses that.

You might want to look at Coverity. It attaches its tool to the compiler to capture everything the compiler receives. You could also override the CC or CXX environment variables with a wrapper that first records everything and then calls the compiler as usual.
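A wrapper of that kind could look roughly like the Python sketch below. It is not how Coverity or any other tool does it. The environment variable names REAL_CC and CC_LOG, the default log path, and the function name log_invocation are all hypothetical.

```python
#!/usr/bin/env python3
import json
import os
import sys


def log_invocation(argv, log_path, cwd):
    """Append one JSON line describing a single compiler invocation."""
    record = {"directory": cwd, "arguments": list(argv)}
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return record


def main():
    # REAL_CC and CC_LOG are hypothetical names chosen for this sketch.
    real_compiler = os.environ.get("REAL_CC", "cc")
    log_path = os.environ.get("CC_LOG", "/tmp/cc_invocations.jsonl")
    argv = [real_compiler] + sys.argv[1:]
    log_invocation(argv, log_path, os.getcwd())
    # Replace this process with the real compiler so the build is unaffected.
    os.execvp(real_compiler, argv)


# With no arguments there is nothing to record or compile, so do nothing.
if __name__ == "__main__" and len(sys.argv) > 1:
    main()
```

Saved as an executable script (say, cc-wrapper), it would be used as `make CC=/path/to/cc-wrapper`. As noted in the other answer, this only works when the Makefiles actually honor CC.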

Upvotes: 0
