Elad Shtiegmann
Elad Shtiegmann

Reputation: 211

C Programming: how to avoid code duplication without losing clarity

edit: Thanks to all repliers. I should have mentioned in my original post that I am not allowed to change any of the specifications of these functions, so solutions using assertions and/or allowing to dereference NULL are out of the question. With this in mind, I gather that it's either I go with a function pointer, or just leave the duplication as it is. For the sake of clarity I'd like to avoid function pointers this time.

original: I am trying to avoid code duplication without losing clarity. often when working on a specific assignment (Uni - undergrad) I recognize these patterns of functions return , but not always with a "great-job" solution..

What would any of you suggest I should do (pointers to functions, macros, etc.) with these three C functions that check some of their arguments in the same way to make the checking more modular (it should be more modular, right?)?

BTW these are taken directly from a HW assignment, so the details of their functionality are not concerning my question, only the arguments checking at the function's top.

 teamIsDuplicateCoachName(Team team, bool* isDuplicate) {
    TeamResult result = TEAM_SUCCESS;
    if (!team || !isDuplicate) {
        result = TEAM_NULL_ARGUMENT;
    } else if (teamEmpty(team)) {
        result = TEAM_IS_EMPTY;
    } else {
        for (int i = 0; i < team->currentFormations; ++i) {
            if (teamIsPlayerInFormation(team->formations[i], team->coach)) {
                *isDuplicate = true;
                break;
            }
        }
    }
    return result;
}

TeamResult teamGetWinRate(Team team, double* winRate) {
    TeamResult result = TEAM_SUCCESS;
    if (!team || !winRate) {
        result = TEAM_NULL_ARGUMENT;
    } else {
        int wins = 0, games = 0;
        for (int i = 0; i < team->currentFormations; ++i) {
            Formation formation = team->formations[i];
            if (formationIsComplete(formation)) {
                games += formation->timesPlayed;
                wins += formation->timesWon;
            }
        }
        double win = ( games == 0 ) ? 0 : (double) wins / games;
        assert(win >= 0 && win <= 1);
        *winRate = win;
    }
    return result;
}



TeamResult teamGetNextIncompleteFormation(Team team, Formation* formation,
        int* index) {
    TeamResult result = TEAM_SUCCESS;
    if (!team || !formation || !index) {
        result = TEAM_NULL_ARGUMENT;
    } else {
        *formation = NULL; /* default result, will be returned if there are no incomplete formations */
        for (int i = 0; i < team->currentFormations; ++i) {
            Formation formationPtr = team->formations[i];
            if (!formationIsComplete(formationPtr)) {
                *formation = formationPtr;
                *index = i;
                break;
            }
        }
    }
    return result;
}

Any advice on how (specifically) to avoid the code duplication would be appreciated.

Thanks for your time! :)

Upvotes: 2

Views: 1525

Answers (6)

Elad Shtiegmann
Elad Shtiegmann

Reputation: 211

Thanks to all repliers. I should have mentioned in my original post that I am not allowed to change any of the specifications of these functions, so solutions using assertions and/or allowing to dereference NULL are out of the question, though I'll consider them for other occasions.

With this in mind, I gather that it's either I go with a function pointer, or just leave the duplication as it is. For the sake of clarity I'd like to avoid function pointers this time.

Upvotes: 0

Paul Hankin
Paul Hankin

Reputation: 58349

It looks like it's a coding mistake to pass nulls to these functions. There's three main ways to deal with this situation.

  1. Handle the erroneous nulls and return an error value. This introduces extra code which checks the arguments to return error values, and extra code around every call site, which now has to handle the error return values. Probably none of this code is tested, since if you knew that code was mistakenly passing nulls you'd just fix it.
  2. Use assert to check validity of arguments, resulting in a clean error message, clear to read preconditions, but some extra code.
  3. Have no precondition checks, and debug segfaults when you deference a NULL.

In my experience 3 is usually the best approach. It adds zero extra code, and a segfault is usually just as easy to debug as the clean error message you'd get from 2. However, you'll find many software engineers who would prefer 2, and it's a matter of taste.

Your code, which is pattern 1, has some significant downsides. First, it's adding extra code which can't be optimised away. Second, more code means more complexity. Third, it's unclear if the functions are supposed to be able to accept broken arguments, or if the code's just there to help debugging when things are wrong.

Upvotes: 2

Sebastien
Sebastien

Reputation: 1476

This is more or less a design question. If the functions above are all static functions (or only one is extern), then the whole "bundle of function" should check the condition - execution flow-wise - once for each object and let the implementation details of lower level functions assume that input data is valid.

For example, if you go back to wherever the team is created, allocated and initialized and wherever the formation is created, allocated and initialized and build rules there that ensure that every created team exists and that no duplicate exists, you will not have to valid the input because by definition/construction it will always be. This is examples of pre conditions. Invariants would be the persistance of the truthfulness of these definitions (no function may alter invariant states upon return) and post conditions would be somewhat the opposite (for example when they are free'd but pointers still exists somewhere).

That being said, manipulating "object-like" data in C, my personnal preference is to create extern functions that creates, returns and destroys such objects. If the members are kept static within the .c files with minimal .h interface, you get something conceptually similar to object oriented programming (though you can never make members fully "private").

Upvotes: 0

There are a few techniques to reduce the redundancy you percieve, which one is applicable heavily depends on the nature of the condition you are checking. In any case, I would advise against any (preprocessor) tricks to reduce duplication which hide what is actually happening.

If you have a condition that should not happen, one concise way to check for it is to use an assert. With an assert you basically say: This condition must be true, otherwise my code has a bug, please check if my assumption is true, and kill my program immediately if it's not. This is often used like this:

#include <assert.h>

void foo(int a, int b) {
    assert((a < b) && "some error message that should appear when the assert fails (a failing assert prints its argument)");
    //do some sensible stuff assuming a is really smaller than b
}

A special case is the question whether a pointer is null. Doing something like

void foo(int* bar) {
    assert(bar);
    *bar = 3;
}

is pretty pointless, because dereferencing a null pointer will securely segfault your program on any sane platform, so the following will just as securely stop your program:

void foo(int* bar) {
    *bar = 3;
}

Language lawyers may not be happy with what I'm saying because, according to the standard, dereferencing a null pointer is undefined behaviour, and technically the compiler would be allowed to produce code that formats your harddrive. However, dereferencing a null pointer is such a common error that you can expect your compiler not to do stupid things with it, and you can expect your system to take special care to ensure that the hardware will scream if you try to do it. This hardware check comes for free, the assert takes a few cycles to check.

The assert (and segfaulting null pointers), however, is only suitable for checking for fatal conditions. If you are just checking for a condition that makes any further work inside a function pointless, I would not hesitate to use an early return. It is usually much more readable, especially since syntax highlighting readily reveals the return statements to the reader:

void foo(int a, int b) {
    if(a >= b) return;
    //do something sensible assuming a < b
}

With this paradigm, your first function would look like this:

TeamResult teamIsDuplicateCoachName(Team team, bool* isDuplicate) {
    if(!team || !isDuplicate) return TEAM_NULL_ARGUMENT;
    if(teamEmpty(team)) return TEAM_IS_EMPTY;

    for (int i = 0; i < team->currentFormations; ++i) {
        if (teamIsPlayerInFormation(team->formations[i], team->coach)) {
            *isDuplicate = true;
            break;
        }
    }
    return TEAM_SUCCESS;
}

I believe, this is much more clear and concise than the version with the if around the body.

Upvotes: 1

noelicus
noelicus

Reputation: 15055

I would create a function to check the team object:

TeamResult TeamPtrCheck(Team *team)
{
    if (team == NULL)
        return TEAM_NULL_ARGUMENT;
    else if (teamEmpty(team))
        return TEAM_IS_EMPTY;
    else
        return TEAM_SUCCESS;
}

And then reference that + your other checks at the top of each function, for example

TeamResult = TeamPtrCheck(team);
if (TeamResult != TEAM_SUCCESS)
    return TeamResult;
if (winRate == NULL)
    return TEAM_NULL_ARGUMENT;

Otherwise, if each function is different then leave the checks as different!

Upvotes: 1

It&#39;sPete
It&#39;sPete

Reputation: 5221

If you are concerned about the duplication of the NULL checks at the start of each function, I wouldn't be. It makes it clear to the user that you are simply doing input validation prior to doing any work. No need to worry about the few lines.

In general, don't sweat the small stuff like this.

Upvotes: 1

Related Questions