Daniel
Daniel

Reputation: 113

C and C++ weird performance difference

Before you read my post, please take into consideration that I am new to both C and C++. I'm mostly a managed code developer.

I have two pieces of identical code (or that's what I believe at least). One in C and one in C++. The code basically checks if the number is a prime, and if it is, it will store it in a container.

C++

Main.cpp

#include <iostream>
#include <vector>
#include <time.h>

static bool isPrime(const int& number) {

    if((number & 1) == 0) {
        if(number == 2)
            return true;
        else
            return false;
    }

    for(int i = 3; (i * i) <= number; i++) {

        if((number % i) == 0)
            return false;
    }

    return number != 1;
}

int main(int argc, const char * argv[]) {

    std::vector<int> vector;
    clock_t start = clock();
    for(int i = 0; i < 30000000; i++) {

        if(isPrime(i))
            vector.push_back(i);
    }
    clock_t end = clock();
    clock_t seconds = (end - start) / CLOCKS_PER_SEC;
    std::cout << "done after " << seconds << " seconds " << std::endl;

    return 0;
}

C

Vector.c

#include <stdlib.h>


typedef struct vector_class {
    void(*push_back)(struct vector_class *vector_instance, const int *data);
    int *data;
    int length;
    int capacity;
} vector;

static void push_back(vector *vector_instance, const int *data) {

    if(vector_instance->length >= vector_instance->capacity) {

        vector_instance->capacity *= 2;
        vector_instance->data = (int*) realloc(vector_instance->data, sizeof(int) * vector_instance->capacity);
    }
    vector_instance->data[vector_instance->length] = *data;
    vector_instance->length++;
}

static void vector_constructor(vector *vector_instance) {

    vector_instance->push_back = &push_back;
    vector_instance->length = 0;
    vector_instance->capacity = 2;
    vector_instance->data = (int*)malloc(sizeof(*vector_instance->data) * vector_instance->capacity);

}

static void vector_destructor(vector *vector_instance) {

    free(vector_instance->data);
    vector_instance->length = 0;
    vector_instance->capacity = 0;
    vector_instance->data = NULL;
}

Main.c

#include <stdio.h>
#include "vector.c"
#include <time.h>

static int isPrime (const int *number) {

    if((*number & 1) == 0) {
        if(*number == 2)
            return 1;
        else
            return 0;
    }

    for(int i = 3; (i * i) <= *number; i += 2) {

        if((*number % i) == 0)
            return 0;
    }

    return *number != 1;
}

int main(int argc, const char * argv[]) {
    vector v;
    vector_constructor(&v);
    clock_t start = clock();
    for(int i = 0; i <= 30000000; i++) {

        if(isPrime(&i))
            v.push_back(&v, &i);
    }
    clock_t end = clock();
    clock_t seconds = (end - start) / CLOCKS_PER_SEC;
    printf("%lu seconds \n", seconds);

    for(int i = 0; i < v.length; i++) {

        //printf("%d \n", v.data[i]);
    }
    vector_destructor(&v);
    return 0;
}

I compile both programs on my OS X Mavericks, with the built in Clang compiler.

C++

g++ -O3 -std=c++11 Main.cpp

C

gcc -O3 -std=c99 Main.c

Both get compiled trouble free, and they also run trouble free. However..

I get different time results.

C finishes after 12 seconds

C++ finishes after 26 seconds

Can anyone point out what I am doing wrong? Thanks!

Upvotes: 0

Views: 228

Answers (1)

jschultz410
jschultz410

Reputation: 2899

Your programs are subtly different in isPrime. In your C++ program:

for(int i = 3; (i * i) <= number; i++) {

In your C program:

for(int i = 3; (i * i) <= *number; i += 2) {

So, your C++ program is computing the remainder about twice as many times as your C program, which likely explains your performance discrepancy.

Beyond that, I recommend that you not pass int by reference or pointer unless you have a good reason. Hopefully, the compiler would be smart enough to figure out that you didn't need to and optimize that out, but who knows?

Also, you want to avoid calling functions through function pointers, like you do in your C program, when you can. They usually hurt a compiler's ability to inline optimize functions. It might be the case here that the compiler is smart enough to inline the call anyway, but again who knows?

Finally, if computing all primes less than N is really what you are after and this just isn't a toy to benchmark C vs. C++, then look into the Sieve of Eratosthenes or the Sieve of Sundaram. Alternatively, you can pass your vector of already known primes into isPrime and check only against already known primes rather than all odd numbers.

Upvotes: 6

Related Questions