Data Flow Coverage

Question

If you write a program, it is usually possible to drive it such that all paths are covered. Hence, 100% coverage is easy to obtain (ignoring infeasible code paths, which modern compilers catch anyway).

However, 100% code coverage should imply that all variable definition-use coverage is also achieved, because variables are defined within the program and used within it. If all code is covered, all DU pairs should also be covered.

Why, then, is it said that path coverage is easier to obtain, while 100% data flow coverage is usually not achievable? I do not understand why not. What would be an example of that?

Phillip Kinkade · Accepted Answer

It's easier to achieve 100% code coverage than to cover all of the possible inputs, because the set of all possible inputs can be extremely large or practically unlimited. It would take too much time to test them all.

Let's look at a simple example function:

double invert(double x) {
    return 1.0/x;
}

A unit test could look like this:

double y = invert(5);
double expected = 1.0/5.0;
EXPECT_EQ( expected, y );

This test achieves 100% code coverage. However, it exercises only 1 of roughly 1.8446744e+19 (2^64) possible input bit patterns (assuming a double is 64 bits wide).

The idea behind All-pairs Testing is that it's not practical to test every possible input, so we have to identify the ranges that would cover all cases.

With my invert() function, there are at least two sets that matter: {non-zero values} and {zero}.

We need to add another test, which covers the same code path, but has a different outcome:

EXPECT_TRUE( std::isinf( invert(0.0) ) );  // with IEEE 754 doubles, dividing by zero yields infinity; it does not throw
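As a rough sketch, the two input classes can be checked side by side without any test framework. This assumes IEEE 754 doubles, where dividing by zero produces an infinity rather than an exception; check_invert_classes is a hypothetical helper name, not part of the original answer.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Same function as in the answer.
double invert(double x) {
    return 1.0 / x;
}

// Hypothetical helper: one check per input class identified above.
void check_invert_classes() {
    // Assumption: the platform uses IEEE 754 (IEC 559) doubles.
    static_assert(std::numeric_limits<double>::is_iec559,
                  "this sketch assumes IEEE 754 doubles");

    // Class {non-zero values}: an ordinary reciprocal.
    assert(invert(5.0) == 1.0 / 5.0);

    // Class {zero}: same code path, different outcome (positive infinity).
    assert(std::isinf(invert(0.0)));
    assert(!std::signbit(invert(0.0)));

    // Negative zero is a distinct input and gives negative infinity.
    assert(std::isinf(invert(-0.0)));
    assert(std::signbit(invert(-0.0)));
}
```

All of these checks run the same single line of code, so none of them changes the coverage figure; only the choice of inputs differs.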

Furthermore, since the test writer has to work out which sets of parameters are needed for full input coverage, it may be impossible to know what the correct sets are.

Consider this function:

double multiply(double x, double y);

My instinct would be to write one test for small numbers and another for big numbers, to check for overflow.

However, the developer may have written it poorly, in this way:

double multiply(double x, double y) {
    if (x == 0) return 0;
    return 1.0 / ( (1.0/x) * (1.0/y) );
}

If our tests never passed 0 for y, they would never exercise the division by zero in 1.0/y, so any bug on that path would go unnoticed. Knowledge of how the algorithms are designed is very important in understanding the proper inputs for a unit test, and that's why the programmers who write the code need to be involved in unit testing.
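To make the point concrete, here is a minimal sketch that compares the implementation above with plain x * y. It assumes IEEE 754 doubles with default floating-point settings (no fast-math or flush-to-zero); multiply_poor and agrees_with_product are hypothetical names introduced for illustration. Under that assumption the y == 0 case happens to come out as 0, because the intermediate infinity inverts back to zero, while other special inputs that you would only think of after reading the code do disagree with x * y.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// The implementation from the answer, renamed (hypothetical name) for clarity.
double multiply_poor(double x, double y) {
    if (x == 0) return 0;
    return 1.0 / ((1.0 / x) * (1.0 / y));
}

// Hypothetical helper: true when multiply_poor agrees with x * y.
// Two NaN results count as agreement, since NaN never compares equal.
bool agrees_with_product(double x, double y) {
    const double got = multiply_poor(x, y);
    const double want = x * y;
    if (std::isnan(got) || std::isnan(want)) {
        return std::isnan(got) && std::isnan(want);
    }
    return got == want;
}

// Hypothetical driver showing which inputs expose a difference.
void check_multiply_inputs() {
    const double inf = std::numeric_limits<double>::infinity();
    const double tiny = std::numeric_limits<double>::denorm_min();

    assert(agrees_with_product(2.0, 4.0));    // ordinary values
    assert(agrees_with_product(3.0, 0.0));    // y == 0: 1/inf is 0
    assert(!agrees_with_product(0.0, inf));   // early return hides a NaN
    assert(!agrees_with_product(tiny, 2.0));  // 1/x overflows, result collapses to 0
}
```

Every one of these calls passes through code that the first test already covered, which is why a coverage figure alone says little about whether the inputs were well chosen.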
