user15051177


Optimization in C

I've been trying to optimize some simple code using two kinds of optimization: loop unrolling and memory aliasing.
My original code:

int paint(char *dst, unsigned n, char *src, char bias)
{
    unsigned i;
    for (i=0;i<n;i++) {
        *dst++ = bias + *src++;
    }
    return 0;
}

My optimized code after loop unrolling:

int paint(char *dst, unsigned n, char *src, char bias)
{
    unsigned i;
    for (i=0;i<n;i+=2) {
        *dst++ = bias + *src++;
        *dst++ = bias + *src++;
    }
    return 0;
}

After this, how can I optimize the code with respect to memory aliasing? And are there other good optimizations for this code (such as casting the pointers to long pointers to copy faster)?

Upvotes: 0

Views: 225

Answers (2)

Vlad Feinstein

Reputation: 11311

Are you only concerned about performance? What about correctness?

Judging by the name of your function, paint, and the variable bias (and using my crystal ball), I guess you need to add with saturation (in case of overflow). This can be done using intrinsics for paddusb (https://www.felixcloutier.com/x86/paddusb:paddusw): https://software.intel.com/sites/landingpage/IntrinsicsGuide/#expand=774,433,4179,4179&cats=Arithmetic&text=paddusb
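As a rough sketch of what saturating addition could look like, assuming the pixels are unsigned bytes (the question uses plain char, so this is a guess about the intent): the function name paint_saturating and the uint8_t/size_t parameter types are made up for illustration. On targets where the compiler defines __SSE2__, the main loop uses _mm_adds_epu8, the intrinsic for paddusb; elsewhere only the scalar loop runs.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics; _mm_adds_epu8 maps to paddusb */
#endif

/* Hypothetical variant of paint(): adds bias to each unsigned byte,
   clamping the result at 255 instead of letting it wrap around. */
int paint_saturating(uint8_t *dst, size_t n, const uint8_t *src, uint8_t bias)
{
    size_t i = 0;

#if defined(__SSE2__)
    /* Broadcast bias into all 16 byte lanes of a vector register. */
    const __m128i vbias = _mm_set1_epi8((char)bias);

    /* Process 16 bytes per iteration using unaligned loads and stores. */
    for (; i + 16 <= n; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
        v = _mm_adds_epu8(v, vbias);  /* unsigned saturating add per byte */
        _mm_storeu_si128((__m128i *)(dst + i), v);
    }
#endif

    /* Scalar tail, which is also the complete fallback without SSE2. */
    for (; i < n; i++) {
        unsigned sum = (unsigned)src[i] + bias;
        dst[i] = (uint8_t)(sum > 255u ? 255u : sum);
    }
    return 0;
}
```

With a bias of 100, an input byte of 200 comes out as 255 rather than wrapping to 44.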

Upvotes: 1

Zan Lynx

Reputation: 54325

Optimization in C is easier than this.

cc -Wall -W -pedantic -O3 -march=native -flto source.c

That will unroll any loops that need to be unrolled. Doing your own unrolling, using Duff's Device, and similar tricks are outdated and pretty useless with a modern optimizing compiler.

As for aliasing, your function uses two char* parameters. If they are guaranteed never to point into the same array, then you can use the restrict keyword (available since C99). That allows the optimizer to assume more about the code and use vectorized instructions.
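A minimal sketch of the restrict suggestion, applied to the loop from the question. The name paint_restrict is made up here, and the const on src is an addition; the key assumption is that callers never pass overlapping buffers, since doing so with restrict-qualified pointers is undefined behavior.

```c
/* Same loop as the question's paint(), but with restrict-qualified pointers.
   Assumption: the caller guarantees that dst and src never overlap. */
int paint_restrict(char *restrict dst, unsigned n,
                   const char *restrict src, char bias)
{
    unsigned i;
    for (i = 0; i < n; i++) {
        /* bias + src[i] is computed as int, then narrowed back to char. */
        dst[i] = (char)(bias + src[i]);
    }
    return 0;
}
```

Without restrict, the compiler generally has to allow for dst and src overlapping, for example by emitting a runtime overlap check in front of the vectorized loop. With restrict it can typically skip that check.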

Check out the assembly produced here: https://godbolt.org/z/xMfebr or https://godbolt.org/z/j1xMYz

Can you manage to do all of that by hand? Probably not.

Upvotes: 1
