Reputation: 182
I am trying to speed up the following code with OpenMP. It calculates a Mandelbrot set and outputs it to a canvas.
The code works fine single-threaded, but I want to use OpenMP to make it faster. I have tried all sorts of combinations of private and shared variables, but nothing has worked so far. The code always runs a little slower with OpenMP than without it (about 2 s slower at 50 000 iterations).
I am using Ubuntu 16.04 and compiling with GCC.
void calculate_mandelbrot(GLubyte *canvas, GLubyte *color_buffer, uint32_t w, uint32_t h, mandelbrot_f x0, mandelbrot_f x1, mandelbrot_f y0, mandelbrot_f y1, uint32_t max_iter) {
    mandelbrot_f dx = (x1 - x0) / w;
    mandelbrot_f dy = (y1 - y0) / h;
    uint16_t esc_time;
    int i, j;
    mandelbrot_f x, y;

    //timer start
    clock_t begin = clock();

    #pragma omp parallel for private(i,j,x,y, esc_time) shared(canvas, color_buffer)
    for(i = 0; i < w; ++i) {
        x = x0 + i * dx;
        for(j = 0; j < h; ++j) {
            y = y1 - j * dy;
            esc_time = escape_time(x, y, max_iter);
            canvas[ GET_R(i, j, w) ] = color_buffer[esc_time * 3];
            canvas[ GET_G(i, j, w) ] = color_buffer[esc_time * 3 + 1];
            canvas[ GET_B(i, j, w) ] = color_buffer[esc_time * 3 + 2];
        }
    }

    //time calculation
    clock_t end = clock();
    double time_spent = (double)(end - begin) / CLOCKS_PER_SEC;
    printf("%f\n",time_spent );
}
The escape_time function that the code uses:
inline uint16_t escape_time(mandelbrot_f x0, mandelbrot_f y0, uint32_t max_iter) {
    mandelbrot_f x = 0.0;
    mandelbrot_f y = 0.0;
    mandelbrot_f xtemp;
    uint16_t iteration = 0;
    while((x*x + y*y < 4) && (iteration < max_iter)) {
        xtemp = x*x - y*y + x0;
        y = 2*x*y + y0;
        x = xtemp;
        iteration++;
    }
    return iteration;
}
The code is from this repository: https://github.com/hortont424/mandelbrot
Upvotes: 0
Views: 254
Reputation: 182
As was suggested, my problem was caused by timing with the clock() function, which measures CPU time rather than wall-clock time. Using omp_get_wtime() instead solved my problem.
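For anyone who wants to see the difference for themselves, here is a minimal sketch that takes both measurements around the same parallel loop. The names wall_seconds, busy_sum and time_both are made up for illustration, and the workload is just a stand-in sum, not the Mandelbrot code. The fallback branch is an assumption so that the file still builds when OpenMP is not enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Wall-clock seconds. Falls back to timespec_get() when the file is
   compiled without OpenMP support (assumption: a C11 library). */
static double wall_seconds(void) {
#ifdef _OPENMP
    return omp_get_wtime();
#else
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
#endif
}

/* Hypothetical stand-in workload: sums 1/k^2 for k = 1..n in parallel. */
static double busy_sum(long n) {
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long k = 0; k < n; ++k) {
        sum += 1.0 / ((double)(k + 1) * (double)(k + 1));
    }
    return sum;
}

/* Runs the workload once and stores CPU time and wall time in seconds. */
static double time_both(long n, double *cpu_s, double *wall_s) {
    assert(cpu_s != NULL && wall_s != NULL);
    clock_t c0 = clock();
    double w0 = wall_seconds();
    double result = busy_sum(n);
    *wall_s = wall_seconds() - w0;
    *cpu_s = (double)(clock() - c0) / CLOCKS_PER_SEC;
    return result;
}
```

Built with gcc -fopenmp and called with a large n on a multi-core machine, the CPU figure should come out larger than the wall figure, because on Linux clock() adds up the time of every thread in the process. That is why a parallel run can look no faster, or even slower, when timed with clock().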
Upvotes: 0
Reputation: 51463
First, as hinted in the comment, use omp_get_wtime() instead of clock() to measure the time; clock() gives you the number of clock ticks accumulated across all threads. Second, if I recall correctly, this algorithm has load-balancing problems, so try dynamic scheduling:
//timer start
double begin = omp_get_wtime();

#pragma omp parallel for private(j,x,y, esc_time) schedule(dynamic, 1)
for(i = 0; i < w; ++i) {
    x = x0 + i * dx;
    for(j = 0; j < h; ++j) {
        y = y1 - j * dy;
        esc_time = escape_time(x, y, max_iter);
        canvas[ GET_R(i, j, w) ] = color_buffer[esc_time * 3];
        canvas[ GET_G(i, j, w) ] = color_buffer[esc_time * 3 + 1];
        canvas[ GET_B(i, j, w) ] = color_buffer[esc_time * 3 + 2];
    }
}

//time calculation (omp_get_wtime() already returns seconds)
double end = omp_get_wtime();
double time_spent = end - begin;
printf("%f\n",time_spent );
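If you want to try the scheduling idea outside the original project, here is a self-contained sketch. It is not the repository's code: escape_time_sketch and render_escape_times are hypothetical names, double stands in for mandelbrot_f, and the result goes into a plain array of iteration counts instead of the GL canvas. Declaring x and y inside the loops makes them private to each thread automatically, so no private clause is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative escape-time kernel; double replaces mandelbrot_f (assumption). */
static uint32_t escape_time_sketch(double cx, double cy, uint32_t max_iter) {
    double x = 0.0, y = 0.0;
    uint32_t iteration = 0;
    while (x * x + y * y < 4.0 && iteration < max_iter) {
        double xtemp = x * x - y * y + cx;
        y = 2.0 * x * y + cy;
        x = xtemp;
        ++iteration;
    }
    return iteration;
}

/* Fills out[j * w + i] with the escape time of pixel (i, j).
   Columns are handed out one at a time, so a thread that finishes a
   cheap column early picks up the next one instead of idling. */
static void render_escape_times(uint32_t *out, int w, int h,
                                double x0, double x1,
                                double y0, double y1,
                                uint32_t max_iter) {
    assert(out != NULL && w > 0 && h > 0);
    double dx = (x1 - x0) / w;
    double dy = (y1 - y0) / h;

    #pragma omp parallel for schedule(dynamic, 1)
    for (int i = 0; i < w; ++i) {
        double x = x0 + i * dx;          /* private: declared in the loop */
        for (int j = 0; j < h; ++j) {
            double y = y1 - j * dy;      /* private: declared in the loop */
            out[(size_t)j * w + i] = escape_time_sketch(x, y, max_iter);
        }
    }
}
```

Columns that cross the inside of the set run all max_iter iterations for many pixels, while columns near the edge of the view escape almost immediately. With the default static split, some threads may end up with mostly expensive columns; schedule(dynamic, 1) is one way to even that out, at the cost of a little scheduling overhead per column.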
Upvotes: 1