simulate
simulate

Reputation: 1263

When to choose multiple loops over single loop with if-statements

I have to process each element of an array the same way once and have to modify each by an unpredictable pattern afterwards aswell.

Is there any difference in performance between these snippets and why, if so?

std::vector<int> nums;
//fill nums array

for(unsigned int i = 0; i < nums.size(); ++i){
    nums[i] *= nums[i];

    if(nums[i] < 10){
        nums[i] = 0;
    }
}

std::vector<int> nums;
//fill nums array

for(unsigned int i = 0; i < nums.size(); ++i){
    nums[i] *= nums[i];
}

for(unsigned int i = 0; i < nums.size(); ++i){
    if(nums[i] < 10){
        nums[i] = 0;
    }
}

Would a different approach like this improve anything?

std::vector<int> nums;
//fill nums array

std::vector<int> flags;
flags.resize(nums.size());

for(unsigned int i = 0; i < nums.size(); ++i){
    nums[i] *= nums[i];
    flags[i] = nums[i] < 10;
}

for(unsigned int i = 0; i < flags.size(); ++i){
   nums[i] = (!flags[i]) * nums[i]; //!flags[i] is 0 if nums[i] < 10
}

Upvotes: 3

Views: 693

Answers (3)

Richard Hodges
Richard Hodges

Reputation: 69882

Test results on my machine (100 million ints in the array):

checkpoint
starting test: approach1
test took: 84ms
starting test: approach2
test took: 190ms
starting test: approach3
test took: 529ms
starting test: Bathsheba's idea
test took: 61ms

Incidentally, writing idiomatic code is often the most efficient of all. clang, gcc et. al. optimisers are fantastic:

void approach5(std::vector<int> &nums) {

    auto filter = [](auto x) { return (x < 10) ? 0 : x; };

    auto grow = [filter](auto x) { return filter(x * x); };

    std::transform(begin(nums), end(nums), begin(nums), grow);
}

Here's the code.

#include <iostream>
#include <chrono>
#include <random>
#include <array>
#include <vector>
#include <algorithm>

auto constexpr test_size = std::size_t(100'000'000);
using namespace std::literals;

void approach1(std::vector<int> &nums) {
    for (unsigned int i = 0; i < nums.size(); ++i) {
        nums[i] *= nums[i];

        if (nums[i] < 10) {
            nums[i] = 0;
        }
    }
}


void approach2(std::vector<int> &nums) {
    for (unsigned int i = 0; i < nums.size(); ++i) {
        nums[i] *= nums[i];
    }

    for (unsigned int i = 0; i < nums.size(); ++i) {
        if (nums[i] < 10) {
            nums[i] = 0;
        }
    }
}

void approach3(std::vector<int> &nums) {
    std::vector<int> flags;
    flags.resize(nums.size());

    for (unsigned int i = 0; i < nums.size(); ++i) {
        nums[i] *= nums[i];
        flags[i] = nums[i] < 10;
    }

    for (unsigned int i = 0; i < flags.size(); ++i) {
        nums[i] = (!flags[i]) * nums[i]; //!flags[i] is 0 if nums[i] < 10
    }
}

void approach4(std::vector<int> &nums) {
    for (std::size_t i = 0; i < nums.size(); ++i) {
        nums[i] *= (nums[i] <= 3 ? 0 : nums[i]);
    }
}


auto test = [](auto &&name, auto &&approach, auto &&data) {
    std::cout << "starting test: " << name << std::endl;
    auto my_data = std::vector<int>(data.begin(), data.end());
    auto now = std::chrono::high_resolution_clock::now();
    approach(my_data);
    auto then = std::chrono::high_resolution_clock::now();
    auto diff = std::chrono::duration_cast<std::chrono::milliseconds>(then - now);
    std::cout << "test took: " << diff.count() << "ms" << std::endl;
};

std::array<int, test_size> test_data;

int main() {
    std::random_device rd;
    std::default_random_engine eng(rd());
    std::uniform_int_distribution<> dist(0, 100);

    std::generate(test_data.begin(), test_data.end(), [&]() { return dist(eng); });
    std::cout << "checkpoint" << std::endl;

    test("approach1", approach1, test_data);
    test("approach2", approach2, test_data);
    test("approach3", approach3, test_data);
    test("Bathsheba's idea", approach4, test_data);
}

Upvotes: 3

Sherif.Mohamed
Sherif.Mohamed

Reputation: 19

In Algorithm analysis First pattern: time complexity is size of array O(n) Second pattern: time complexity is max between 2 loops O(n) and O(n) so it will be O(n) also.

First and second patterns are a same.

Third pattern not improve any thing.

Conclusion: 3 patterns are techniques not more.

Upvotes: 0

Bathsheba
Bathsheba

Reputation: 234715

The first one has fewer statements (which is always a good rough guide: fewer similar statements often means better performance), but above everything else, it is far more readable.

Don't micro-optimise: leave that to the compiler.

If you are in any doubt check the generated assembly / profile the performance.

Finally (apologise, for I couldn't resist), if nums was a std::vector<unsigned>, then you could have written

for (std::size_t i = 0; i < nums.size(); ++i){
    nums[i] *= (nums[i] <= 3 ? 0 : nums[i]);
}

which could help the branch predictor.

Upvotes: 4

Related Questions