Filippo Mazzarotto
Filippo Mazzarotto

Reputation: 33

Avoid code duplication between two functions, but one returns in the middle of the code

Lets say i have two functions inside a class:

vector<int> getAllPossibleNumbers() {
    vector<int> nums;
    for (int i=0; i < MAX; i++)
        if (isAPossibleNumber(i))
            nums.push_back(i);
    return nums;
}

bool hasAtLeastOnePossibleNumber() {
    for (int i=0; i < MAX; ++i)
        if (isAPossibleNumber(i))
            return true;
    return false;
}

I could rewrite the second function as

bool hasAtLeastOnePossibleNumber() {
    return getAllPossibleNumbers().size() > 0;
}

But it's obvious, especially with a very high MAX number, how the first is much more efficent.

So, how can I avoid code duplication of this kind?

I would prefer a general solution, that can work with more complicated functions and can compile with C++11.

Upvotes: 1

Views: 118

Answers (3)

cigien
cigien

Reputation: 60208

You don't actually have any code duplication. It just looks like you do, because you've used index based for loops to implement different algorithms. If you write loops like this everywhere, it's going to look like you have repeated code.

What you should do is use the appropriate algorithms

vector<int> getAllPossibleNumbers() 
{
    vector<int> nums;
    std::ranges::copy_if(std::views::iota(0, MAX), 
                         std::back_inserter(nums),
                         isAPossibleNumber);
    return nums;
}

bool hasAtLeastOnePossibleNumber() 
{
    return std::ranges::any_of(std::views::iota(0, MAX),
                               isAPossibleNumber);    
}

And now there's nothing really to deduplicate.

*This is C++20 code, but you can do pretty much the same thing in older revisions.

Upvotes: 1

Alan Birtles
Alan Birtles

Reputation: 36379

One option would be to move most of the logic into a third function with a callback to the outer functions which gives the numbers. The result of the callback can be used to determine whether to continue the loop:

template <typename Callback>
void checkNumbers(Callback callback)
{
    for (int i=0; i < MAX; i++)
    {
        if (isAPossibleNumber(i))
        {
            if (!callback(i))
            {
                break;
            }
        }
    }
}

std::vector<int> getAllPossibleNumbers() {
    std::vector<int> nums;
    checkNumbers([&](int i){ nums.push_back(i); return true; });
    return nums;
}

bool hasAtLeastOnePossibleNumber() {
    bool result = false;
    checkNumbers([&](int i){ result = true; return false; });
    return result;
}

The extra function should be optimised away by modern compilers.

Upvotes: 3

463035818_is_not_an_ai
463035818_is_not_an_ai

Reputation: 122133

I wouldnt bother too much to refactor the code, because the duplciation is not that drastic. The two functions really do different things. However, if you write a function that only returns a single int:

const int MAX = 100;
bool isAPossibleNumber(int i){ return true;}

int getNextPossibleNumber(int start=0,int stop = MAX) {
    for (int i=0; i < stop; i++) {
        if (isAPossibleNumber(i)) return i;
    }
    return stop;
}

Then one of the functions will be simple and short

bool hasAtLeastOnePossibleNumber() {
    return getNextPossibleNumber(0,MAX) != MAX;
}

While the other becomes a little more verbose

std::vector<int> getAllPossibleNumbers() {
    std::vector<int> nums;
    int start = 0;
    while (start < MAX) {
        auto i = getNextPossibleNumber(start);
        if (i != MAX) nums.push_back(i);
        start = i+1;
    }
    return nums;
}

Note that in total this is more code, so its a trade-off. If you don't need getNextPossibleNumber also elsewhere I would probably leave your code as is.


Also note that while this reduces code duplication a little, it does not reduce the amount of work that needs to be done when you do for example:

if (hasPossibleNumber()) {
    auto vec = getAllPossibleNumbers();
}

In the worst case the only possible number is MAX-1 and both functions will iterate the whole range. Instead of letting both functions do the work maybe you are better off at caching the result and reusing it when either of the function is called again.

Upvotes: 2

Related Questions