Reputation: 35
I'm writing code to draw a candlestick chart. I want the box to be red if the open price for the day is greater than the close, and green if the close is higher than the open.
if (open > close) {
    boxColor = red;
} else {
    boxColor = green;
}
Pseudocode is easier than an English sentence for this.
I wrote the code below first and then tried to benchmark it, but I don't know how to get meaningful results.
for(int i = 0; i < history.get().close.size(); i++) {
    auto open = history->open[i];
    auto close = history->close[i];
    int red = ((int)close - (int)open) >> ((int)sizeof(close) * 8);
    int green = ((int)open - (int)close) >> ((int)sizeof(close) * 8);
    gl::color(red,green,0);
    gl::drawSolidRect( Rectf(vec2(i - 1, open), vec2(i + 1, close)) );
}
This is how I tried to benchmark it. Each run just shows 2ns. My main question to the community is this:
Can I actually make it faster by using a right shift and avoid a conditional branch?
#include <benchmark/reporter.h>
static void BM_red_noWork(benchmark::State& state) {
    double open = (double)rand() / RAND_MAX;
    double close = (double)rand() / RAND_MAX;
    while (state.KeepRunning()) {
    }
}
BENCHMARK(BM_red_noWork);
static void BM_red_fast_work(benchmark::State& state) {
    double open = (double)rand() / RAND_MAX;
    double close = (double)rand() / RAND_MAX;
    while (state.KeepRunning()) {
        int red = ((int)open - (int)close) >> sizeof(int) - 1;
    }
}
BENCHMARK(BM_red_fast_work);
static void BM_red_slow_work(benchmark::State& state) {
    double open = (double)rand() / RAND_MAX;
    double close = (double)rand() / RAND_MAX;
    while (state.KeepRunning()) {
        int red = open > close ? 0 : 1;
    }
}
BENCHMARK(BM_red_slow_work);
Thanks!
Upvotes: 1
Views: 1318
Reputation: 17131
As I stated in my comment, the compiler will do these optimizations for you. Here is a minimal compilable example:
int main() {
    volatile int a = 42;
    if (a <= 0) {
        return 0;
    } else {
        return 1;
    }
}
The volatile is simply there to prevent the optimizer from "knowing" the value of a; instead, it forces the value to be read.
This was compiled with the command g++ -O3 -S test.cpp, which produces a file named test.s. Inside test.s is the assembly generated by the compiler (pardon the AT&T syntax):
movl $42, -4(%rsp)
movl -4(%rsp), %eax
testl %eax, %eax
setg %al
movzbl %al, %eax
ret
As you can see, it is branchless. It uses testl to set the flags from the value, then setg writes 1 into %al if the value is greater than 0 (and 0 otherwise), movzbl zero-extends that byte into the return register, and finally it returns.
It should be noted that this was adapted from your code. A much better way to write this is simply:
int main() {
    volatile int a = 42;
    return a > 0;
}
It also generates the same assembly.
This is likely to be better than anything readable you could write directly in C++. For instance your code (hopefully corrected for bit arithmetic errors):
#include <climits>  // for CHAR_BIT

int main() {
    volatile int a = 42;
    return ~(a >> (sizeof(int) * CHAR_BIT - 1)) & 1;
}
Compiles to:
movl $42, -4(%rsp)
movl -4(%rsp), %eax
notl %eax
shrl $31, %eax
ret
This is indeed very slightly smaller, but it's not significantly faster, especially not when you have a GL call right next to it. I'd rather spend 1-3 additional cycles to get readable code than have to scratch my head wondering what my coworker (or me from 6 months ago, which is essentially the same thing) did.
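If you want to check the speed claim yourself, the work being timed has to survive optimization, which is the same job volatile does in the examples above. A result that is computed in the loop and never used is free to be deleted by the optimizer, which would likely explain identical timings for every variant. Here is a minimal sketch using only the standard library rather than Google Benchmark; the names nonneg_compare, nonneg_shift, and ns_per_call are made up for illustration, and whatever numbers it reports depend on compiler, flags, and hardware.

```cpp
#include <cassert>
#include <chrono>
#include <climits>

// Comparison form: 1 if a is non-negative, 0 otherwise.
inline int nonneg_compare(int a) { return a >= 0; }

// Sign-bit form: invert the bits, then move the inverted sign bit down to bit 0.
// The cast to unsigned makes the right shift a well-defined logical shift.
inline int nonneg_shift(int a) {
    return static_cast<int>(~static_cast<unsigned>(a) >> (sizeof(int) * CHAR_BIT - 1));
}

// Hypothetical helper: average wall-clock nanoseconds per call of f.
template <typename F>
double ns_per_call(F f, long iterations) {
    assert(iterations > 0);
    volatile int input = 42;  // volatile read: the compiler cannot assume the value
    volatile int sink = 0;    // volatile write: the result cannot be discarded
    const auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i) {
        sink = f(input);
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::nano>(stop - start).count() / iterations;
}
```

Calling ns_per_call(nonneg_compare, 100000000) and ns_per_call(nonneg_shift, 100000000) in an -O3 build is a reasonable way to compare the two. I would expect the results to be within noise of each other, but that is something to confirm on your own machine rather than a measured result.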
EDIT: It should be remarked that the compiler also optimized the bit arithmetic I wrote, because I wrote it less well than I could have. The assembly is actually (~a) >> 31, which is equivalent to the ~(a >> 31) & 1 that I wrote (at least in most implementations with an unsigned integer; see comments for details).
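One detail worth checking when swapping a comparison for a sign-bit trick is what happens at zero. The sketch below compares the a > 0 form with the sign-bit form; the function names are invented for illustration, and the cast to unsigned is an assumption added here so that the shift is a logical one, matching the shrl in the generated assembly. The two forms agree for every value except a == 0, where the sign-bit form behaves like a >= 0.

```cpp
#include <cassert>
#include <climits>

// The readable form from above: 1 only for strictly positive values.
int positive_by_compare(int a) { return a > 0; }

// The sign-bit form, written as (~a) >> 31 on an unsigned value:
// 1 whenever the sign bit is clear, which includes zero.
int sign_bit_clear(int a) {
    const unsigned top_bit = sizeof(int) * CHAR_BIT - 1;
    return static_cast<int>(~static_cast<unsigned>(a) >> top_bit);
}

// Spot checks covering both signs, zero, and the extreme values.
void check_examples() {
    assert(positive_by_compare(42) == 1 && sign_bit_clear(42) == 1);
    assert(positive_by_compare(-42) == 0 && sign_bit_clear(-42) == 0);
    assert(positive_by_compare(INT_MAX) == 1 && sign_bit_clear(INT_MAX) == 1);
    assert(positive_by_compare(INT_MIN) == 0 && sign_bit_clear(INT_MIN) == 0);
    // The one place the two forms disagree:
    assert(positive_by_compare(0) == 0);
    assert(sign_bit_clear(0) == 1);
}
```

For the candlestick case this only matters on days where open equals close, but it is the kind of edge case that is obvious in the comparison and easy to miss in the bit arithmetic.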
Upvotes: 4