user236520


What is the easiest way of automatically inserting code into existing C++ code?

For example, I want to insert a function call after every line. Such as:

for (int i = 0; i < n; ++i)
{
   double d = 2*i;
}

would become

for (int i = 0; i < n; ++i)
{
   myFuncCall();
   double d = 2*i;
   myFuncCall();
}
myFuncCall();

I have been researching generalized C++ parsers, but they all seem to be a) commercial, b) incomplete, or c) difficult to use.

Compilers aren't my life and this is a means to an end, so I am looking for the fastest solution.

EDIT: The reason I want to do this is we are chasing a nightmare bug where code crashes in release mode but not debug mode. For reasons beyond our control, we can't compile release code with debug symbols, so we are trying to make progress with random print statements. If I could make this work, we would at least immediately know where the code crashes because the inserted statements would act like a trace.

Thanks Andrew

Upvotes: 3

Views: 533

Answers (2)

Ira Baxter

Reputation: 95400

Our DMS Software Reengineering Toolkit with its C++ Front End could do this pretty easily. Yes, it's commercial, but I don't think there are many real noncommercial solutions to your task.

DMS is a program transformation engine that parses, analyzes, and transforms code according to a supplied language definition. Its C++ front end is the language definition for a variety of dialects of C++. As part of the parsing process for C++, DMS can build up compiler-accurate symbol tables. This is needed for OP's task, to distinguish otherwise ambiguous syntax which might be a statement from the alternative declarations such syntax might represent. (See "Why can't C++ be parsed with a LR(1) parser?" for examples of this.)

The value in DMS for this task is that it allows source-to-source transformations to be applied to the abstract syntax trees produced by the parse. The following DMS rule, written in DMS rule syntax, would probably be pretty close to what OP needs:

  domain Cpp~ANSI;

  rule instrument_statements(s: executable_statement):
       executable_statement->executable_statement
  " \s " ->  " { \s ; post_statement_call(); } "

The text inside the meta quotes " ... " is target domain syntax, in this case ANSI C++. The \ is a meta escape; \s represents any executable statement. What this rule does is match all syntactic executable_statements and replace each one by a block of two statements, the first being the original statement, the second being whatever OP wants done after each statement. I've assumed OP simply wanted to call a function, but he may want something more complex here, perhaps involving printing line numbers, function names, or function parameters [requiring some additional transformation rules].

The pattern matching and the transformation are done using parsed syntax trees, so the rule can't get confused by the presence of something that looks like code in a string or a comment, or by something that isn't actually an executable statement (e.g., a declaration), etc. [There's a minor detail I glossed over to prevent this rule from being applied recursively to its own results, but that detail is easily managed using DMS's APIs.] After the transformation, the modified syntax tree is regenerated into compilable C++ source text. OP would compile and run that code instead of his original code.
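To make the effect concrete, here is roughly what a small function might look like after a rule of this shape has been applied. This is a hand-written illustration, not actual DMS output; the function sum_doubles, the counter g_probe_count, and the body of post_statement_call are all assumptions added so the effect can be observed. The return statement is left unwrapped here, since a call placed after it in the same block would never run.

```cpp
// Hypothetical probe: the name comes from the rule above, the body is an assumption.
static int g_probe_count = 0;
void post_statement_call() { ++g_probe_count; }

double sum_doubles(int n)
{
    double total = 0.0;           // declaration: not an executable statement, left alone
    {
        for (int i = 0; i < n; ++i)
        {
            double d = 2 * i;     // declaration: left alone, so d stays in scope below
            { total += d; post_statement_call(); }   // wrapped expression statement
        }
        post_statement_call();    // probe after the for statement itself
    }
    return total;                 // left unwrapped: a probe after return is unreachable
}
```

Calling sum_doubles(3) fires the probe once per loop iteration and once after the loop. Wrapping the declaration of d in its own block would have broken the code, which is why the rule needs to tell statements and declarations apart.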

Note that the post_statement need not actually print anything. If it calls a central function, OP can code whatever predicates/print statements he wants to control the amount of output/overhead that the post_statement consumes. In essence, this can act as a programmable breakpoint.
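A minimal sketch of such a central function follows. Everything except the name post_statement_call is an assumption for illustration: the global switches, the "start printing after N probes" predicate, and the choice of stderr are just one way to keep output and overhead under control.

```cpp
#include <cstdio>

// Hypothetical tracing state; all of these names are assumptions.
static bool g_trace_enabled = false;     // flip at run time to turn output on
static unsigned long g_calls = 0;        // how many probes have fired so far
static unsigned long g_print_from = 0;   // stay silent until this many probes have fired
static unsigned long g_printed = 0;      // how many trace lines were actually written

void post_statement_call()
{
    ++g_calls;
    if (!g_trace_enabled || g_calls < g_print_from)
        return;                          // cheap path: a counter bump and a test, no I/O
    std::fprintf(stderr, "probe %lu\n", g_calls);
    std::fflush(stderr);                 // push the line out before a possible crash
    ++g_printed;
}
```

With g_print_from set just below the probe count seen in a previous crashing run, only the last few probes before the crash produce output.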

This basic idea of inserting probes by transformational methods is used in our line of COTS test coverage and profiling tools, all based directly on DMS, including our C++ Test Coverage Tool. For more details, see http://www.semdesigns.com/Company/Publications/TestCoverage.pdf

OP might find our test coverage tool an easier way to accomplish something pretty close. What the test coverage tool does is insert special trace-data capture statements at the beginning of every block of (unconditional) code, rather than after every individual statement. That trace-data capture is actually a macro invocation, whose code we supply in source form as part of the test coverage product. While it is not the intended use, one could simply replace that macro with the desired tracing, and the test coverage tool would in effect insert the desired code where it would have placed the probes. He can probably still capture the function name, and a unique code point [the test coverage tool manufactures these] as a stand-in for the line number. What he could not do is the more sophisticated tasks that would be possible with DMS proper. For instance, there's no way the trace macro can get its hands on the original line number; that's lost by the time the macro is introduced by the test coverage tool. (With DMS proper, it isn't lost.) But there is a way to convert the "unique code point" back into precise source location information.

EDIT 7/10/2011: OP might even find that running the test coverage tool as a test coverage tool helps him, too. If the application compiled with test coverage is executed and doesn't crash, the "covered code" of that execution has run at least once and is therefore somewhat less likely to be the source of the problem. (No guarantees: just because code executed doesn't mean it's correct.) But the hint is that the problem is someplace else; this would tend to eliminate code that isn't the problem.
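As a sketch of what a replacement probe macro could do, the code below records the most recent code points in a small ring buffer instead of printing each one. The macro name PROBE, its single-argument form, and the helper names are all assumptions; the real macro's name and signature come with the tool's source and are not shown here.

```cpp
#include <cstddef>

// Hypothetical stand-in for the tool's probe macro; name and argument are assumptions.
#define PROBE(id) record_code_point(id)

const std::size_t kHistory = 8;                // how many recent code points to keep
static volatile unsigned g_recent[kHistory];   // ring buffer of recent code point ids
static volatile std::size_t g_next = 0;        // total number of probes recorded so far

inline void record_code_point(unsigned id)
{
    g_recent[g_next % kHistory] = id;          // overwrite the oldest slot
    g_next = g_next + 1;
}

// Most recently recorded code point, or 0 if nothing has been recorded yet.
inline unsigned last_code_point()
{
    return g_next == 0 ? 0u : g_recent[(g_next - 1) % kHistory];
}
```

Writing to memory is much cheaper than printing, so this disturbs timing less; the buffer can then be dumped from a crash handler or inspected in a core file to see which code points ran last.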

Upvotes: 4

Mark B

Reputation: 96291

Just to ask the obvious question: can you compile release mode with a separate symbol file?

If not, I would actually suggest a manual "binary search" approach rather than putting prints on every line. The problem with so many print statements is they can both slow down your program and unintentionally change its observable behavior. The fewer you can get away with the better.
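A minimal sketch of that approach, assuming a hand-rolled checkpoint helper (CHECKPOINT, checkpoint, and suspect_region are hypothetical names): place a few checkpoints around the suspect region, see which is the last one printed before the crash, then move them closer together and repeat.

```cpp
#include <cstdio>

static int g_checkpoints_hit = 0;   // assumption: a counter kept only to make the sketch observable

inline void checkpoint(const char* file, int line)
{
    ++g_checkpoints_hit;
    std::fprintf(stderr, "reached %s:%d\n", file, line);
    std::fflush(stderr);            // flush so the message is out before a possible crash
}

// Records the source location of the call site.
#define CHECKPOINT() checkpoint(__FILE__, __LINE__)

void suspect_region(int n)
{
    CHECKPOINT();                   // start of the suspect region
    double total = 0.0;
    for (int i = 0; i < n; ++i)
        total += 2 * i;             // first half of the suspect code
    CHECKPOINT();                   // midpoint: if this never prints, look in the first half
    for (int i = 0; i < n; ++i)
        total -= i;                 // second half of the suspect code
    CHECKPOINT();                   // end: if this never prints, look in the second half
}
```

Each round halves the region that can contain the crash, so only a handful of prints are ever active at once.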

Upvotes: 6
