Reputation: 35149
I'd like to calculate x/y where x and y are both signed integers, and get a result rounded to the nearest integer. Specifically, I'd like a function rquotient(x, y)
using integer-only arithmetic such that:
ASSERT(rquotient(59, 4) == 15);
ASSERT(rquotient(59, -4) == -15);
ASSERT(rquotient(-59, 4) == -15);
ASSERT(rquotient(-59, -4) == 15);
ASSERT(rquotient(57, 4) == 14);
ASSERT(rquotient(57, -4) == -14);
ASSERT(rquotient(-57, 4) == -14);
ASSERT(rquotient(-57, -4) == 14);
I've looked to S.O. for a solution and found the following (each with their own shortcoming):
Upvotes: 6
Views: 1708
Reputation: 351
For rounding to the nearest integer, we're only interested in the remainder if it is larger than half of the numerator. In other words; we're only interested in the decimal part if it's greater or equal to 0.5. We don't even need to know the value of the decimal part. We only need to know if it is greater or equal to 0.5. The modulo operation is expensive, so an alternative would be to multiply the numerator by 2 before doing the integer division. Or to left-shift it by 1, which is quicker, but gcc does that substitution for you.
After this division, the least-significant bit will contain the remaining 0.5. All you need to do is to add that to the result after shifting it back.
Below a code example, to make it more visual:
unsigned int uintdiv1(unsigned int numerator, unsigned int denominator)
{
unsigned int div;
div = (numerator << 1) / denominator;
if (div & 1)
return (div >> 1) + 1;
return (div >> 1);
}
This version might be a bit quicker, and can be converted to a macro more easily:
unsigned int uintdiv2(unsigned int numerator, unsigned int denominator)
{
unsigned int div;
div = (numerator << 1) / denominator;
div += (div & 1) << 1;
return (div >> 1);
}
These examples work for unsigned integers, but you requested a solution for signed integers. That makes things a little more complicated, but not much.
If the result is negative, the default rounding already works in our favor. So we only need to test if the result is expected to be positive, and add the remaining 0.5 only if that's the case.
Again a code example, to make it more visual:
int intdiv1(int numerator, int denominator)
{
int div;
div = (numerator << 1) / denominator;
if (!((numerator < 0) ^ (denominator < 0)) && (div & 1))
return (div >> 1) + 1;
return (div >> 1);
}
The really brave among us, could write it like this, but I wouldn't recommend it, as it may produce unexpected result with non-gcc compilers:
int intdiv2(int numerator, int denominator)
{
int div;
div = (numerator << 1) / denominator;
div += (!((numerator < 0) ^ (denominator < 0)) && (div & 1)) << 1;
return (div >> 1);
}
This algorithm could be implemented in a macro, which is more versatile, because macros are handled by the preprocessor and are type agnostic.
#define INTDIV(n, d) (((((n) << 1) / (d)) + ((!(((n) < 0) ^ ((d) < 0)) && ((((n) << 1) / (d)) & 1)) ? (1 << 1) : 0)) >> 1)
What you really need to be aware of, is that the implementations above do not deal with overflow errors. If you plug in a very large number, and it is multiplied by 2, it may overflow, causing all kinds of havoc. This can be mitigated to some extend by using 64-bit arithmetic when using the macro:
#include <stdint.h>
int result = INTDIV((int64_t)n, d);
And finally the proof that it works:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#define INTDIV(n, d) (((((n) << 1) / (d)) + ((!(((n) < 0) ^ ((d) < 0)) && ((((n) << 1) / (d)) & 1)) ? (1 << 1) : 0)) >> 1)
static unsigned int uintdiv1(unsigned int numerator, unsigned int denominator)
{
unsigned int div;
div = (numerator << 1) / denominator;
if (div & 1)
return (div >> 1) + 1;
return (div >> 1);
}
static unsigned int uintdiv2(unsigned int numerator, unsigned int denominator)
{
unsigned int div;
div = (numerator << 1) / denominator;
div += (div & 1) << 1;
return (div >> 1);
}
static int intdiv1(int numerator, int denominator)
{
int div;
div = (numerator << 1) / denominator;
if (!((numerator < 0) ^ (denominator < 0)) && (div & 1))
return (div >> 1) + 1;
return (div >> 1);
}
static int intdiv2(int numerator, int denominator)
{
int div;
div = (numerator << 1) / denominator;
div += (!((numerator < 0) ^ (denominator < 0)) && (div & 1)) << 1;
return (div >> 1);
}
static void pruintdiv1(unsigned int numerator, unsigned int denominator)
{
unsigned int expected = roundf((float)numerator / denominator);
unsigned int result = uintdiv1(numerator, denominator);
printf("%u / %u = %u, uintdiv1(%u, %u) = %u%s\n", numerator, denominator, expected, numerator, denominator, result, (result != expected) ? " ERROR!" : "");
}
static void pruintdiv2(unsigned int numerator, unsigned int denominator)
{
unsigned int expected = roundf((float)numerator / denominator);
unsigned int result = uintdiv2(numerator, denominator);
printf("%u / %u = %u, uintdiv2(%u, %u) = %u%s\n", numerator, denominator, expected, numerator, denominator, result, (result != expected) ? " ERROR!" : "");
}
static void printdiv1(int numerator, int denominator)
{
int expected = roundf((float)numerator / denominator);
int result = intdiv1(numerator, denominator);
printf("%d / %d = %d, intdiv1(%d, %d) = %d%s\n", numerator, denominator, expected, numerator, denominator, result, (result != expected) ? " ERROR!" : "");
}
static void printdiv2(int numerator, int denominator)
{
int expected = roundf((float)numerator / denominator);
int result = intdiv2(numerator, denominator);
printf("%d / %d = %d, intdiv2(%d, %d) = %d%s\n", numerator, denominator, expected, numerator, denominator, result, (result != expected) ? " ERROR!" : "");
}
static void prINTDIV(int numerator, int denominator)
{
int expected = roundf((float)numerator / denominator);
int result = INTDIV(numerator, denominator);
printf("%d / %d = %d, INTDIV(%d, %d) = %d%s\n", numerator, denominator, expected, numerator, denominator, result, (result != expected) ? " ERROR!" : "");
}
int main(int argc, char* argv[])
{
pruintdiv1(57, 4);
pruintdiv1(58, 4);
pruintdiv1(59, 4);
pruintdiv1(60, 4);
pruintdiv1(61, 4);
pruintdiv1(62, 4);
printf("\n");
pruintdiv2(57, 4);
pruintdiv2(58, 4);
pruintdiv2(59, 4);
pruintdiv2(60, 4);
pruintdiv2(61, 4);
pruintdiv2(62, 4);
printf("\n");
printdiv1(57, 4);
printdiv1(58, 4);
printdiv1(59, 4);
printdiv1(60, 4);
printdiv1(61, 4);
printdiv1(62, 4);
printf("\n");
printdiv1(-57, 4);
printdiv1(-58, 4);
printdiv1(-59, 4);
printdiv1(-60, 4);
printdiv1(-61, 4);
printdiv1(-62, 4);
printf("\n");
printdiv1(57, -4);
printdiv1(58, -4);
printdiv1(59, -4);
printdiv1(60, -4);
printdiv1(61, -4);
printdiv1(62, -4);
printf("\n");
printdiv1(-57, -4);
printdiv1(-58, -4);
printdiv1(-59, -4);
printdiv1(-60, -4);
printdiv1(-61, -4);
printdiv1(-62, -4);
printf("\n");
printdiv2(57, 4);
printdiv2(58, 4);
printdiv2(59, 4);
printdiv2(60, 4);
printdiv2(61, 4);
printdiv2(62, 4);
printf("\n");
printdiv2(-57, 4);
printdiv2(-58, 4);
printdiv2(-59, 4);
printdiv2(-60, 4);
printdiv2(-61, 4);
printdiv2(-62, 4);
printf("\n");
printdiv2(57, -4);
printdiv2(58, -4);
printdiv2(59, -4);
printdiv2(60, -4);
printdiv2(61, -4);
printdiv2(62, -4);
printf("\n");
printdiv2(-57, -4);
printdiv2(-58, -4);
printdiv2(-59, -4);
printdiv2(-60, -4);
printdiv2(-61, -4);
printdiv2(-62, -4);
printf("\n");
prINTDIV(57, 4);
prINTDIV(58, 4);
prINTDIV(59, 4);
prINTDIV(60, 4);
prINTDIV(61, 4);
prINTDIV(62, 4);
printf("\n");
prINTDIV(-57, 4);
prINTDIV(-58, 4);
prINTDIV(-59, 4);
prINTDIV(-60, 4);
prINTDIV(-61, 4);
prINTDIV(-62, 4);
printf("\n");
prINTDIV(57, -4);
prINTDIV(58, -4);
prINTDIV(59, -4);
prINTDIV(60, -4);
prINTDIV(61, -4);
prINTDIV(62, -4);
printf("\n");
prINTDIV(-57, -4);
prINTDIV(-58, -4);
prINTDIV(-59, -4);
prINTDIV(-60, -4);
prINTDIV(-61, -4);
prINTDIV(-62, -4);
return EXIT_SUCCESS;
}
Build with:
$ gcc intdiv.c -Wall -pedantic -lm -o intdiv
And let her rip:
robert@prodesk:~/src/test/intdiv$ ./intdiv
57 / 4 = 14, uintdiv1(57, 4) = 14
58 / 4 = 15, uintdiv1(58, 4) = 15
59 / 4 = 15, uintdiv1(59, 4) = 15
60 / 4 = 15, uintdiv1(60, 4) = 15
61 / 4 = 15, uintdiv1(61, 4) = 15
62 / 4 = 16, uintdiv1(62, 4) = 16
57 / 4 = 14, uintdiv2(57, 4) = 14
58 / 4 = 15, uintdiv2(58, 4) = 15
59 / 4 = 15, uintdiv2(59, 4) = 15
60 / 4 = 15, uintdiv2(60, 4) = 15
61 / 4 = 15, uintdiv2(61, 4) = 15
62 / 4 = 16, uintdiv2(62, 4) = 16
57 / 4 = 14, intdiv1(57, 4) = 14
58 / 4 = 15, intdiv1(58, 4) = 15
59 / 4 = 15, intdiv1(59, 4) = 15
60 / 4 = 15, intdiv1(60, 4) = 15
61 / 4 = 15, intdiv1(61, 4) = 15
62 / 4 = 16, intdiv1(62, 4) = 16
-57 / 4 = -14, intdiv1(-57, 4) = -14
-58 / 4 = -15, intdiv1(-58, 4) = -15
-59 / 4 = -15, intdiv1(-59, 4) = -15
-60 / 4 = -15, intdiv1(-60, 4) = -15
-61 / 4 = -15, intdiv1(-61, 4) = -15
-62 / 4 = -16, intdiv1(-62, 4) = -16
57 / -4 = -14, intdiv1(57, -4) = -14
58 / -4 = -15, intdiv1(58, -4) = -15
59 / -4 = -15, intdiv1(59, -4) = -15
60 / -4 = -15, intdiv1(60, -4) = -15
61 / -4 = -15, intdiv1(61, -4) = -15
62 / -4 = -16, intdiv1(62, -4) = -16
-57 / -4 = 14, intdiv1(-57, -4) = 14
-58 / -4 = 15, intdiv1(-58, -4) = 15
-59 / -4 = 15, intdiv1(-59, -4) = 15
-60 / -4 = 15, intdiv1(-60, -4) = 15
-61 / -4 = 15, intdiv1(-61, -4) = 15
-62 / -4 = 16, intdiv1(-62, -4) = 16
57 / 4 = 14, intdiv2(57, 4) = 14
58 / 4 = 15, intdiv2(58, 4) = 15
59 / 4 = 15, intdiv2(59, 4) = 15
60 / 4 = 15, intdiv2(60, 4) = 15
61 / 4 = 15, intdiv2(61, 4) = 15
62 / 4 = 16, intdiv2(62, 4) = 16
-57 / 4 = -14, intdiv2(-57, 4) = -14
-58 / 4 = -15, intdiv2(-58, 4) = -15
-59 / 4 = -15, intdiv2(-59, 4) = -15
-60 / 4 = -15, intdiv2(-60, 4) = -15
-61 / 4 = -15, intdiv2(-61, 4) = -15
-62 / 4 = -16, intdiv2(-62, 4) = -16
57 / -4 = -14, intdiv2(57, -4) = -14
58 / -4 = -15, intdiv2(58, -4) = -15
59 / -4 = -15, intdiv2(59, -4) = -15
60 / -4 = -15, intdiv2(60, -4) = -15
61 / -4 = -15, intdiv2(61, -4) = -15
62 / -4 = -16, intdiv2(62, -4) = -16
-57 / -4 = 14, intdiv2(-57, -4) = 14
-58 / -4 = 15, intdiv2(-58, -4) = 15
-59 / -4 = 15, intdiv2(-59, -4) = 15
-60 / -4 = 15, intdiv2(-60, -4) = 15
-61 / -4 = 15, intdiv2(-61, -4) = 15
-62 / -4 = 16, intdiv2(-62, -4) = 16
57 / 4 = 14, INTDIV(57, 4) = 14
58 / 4 = 15, INTDIV(58, 4) = 15
59 / 4 = 15, INTDIV(59, 4) = 15
60 / 4 = 15, INTDIV(60, 4) = 15
61 / 4 = 15, INTDIV(61, 4) = 15
62 / 4 = 16, INTDIV(62, 4) = 16
-57 / 4 = -14, INTDIV(-57, 4) = -14
-58 / 4 = -15, INTDIV(-58, 4) = -15
-59 / 4 = -15, INTDIV(-59, 4) = -15
-60 / 4 = -15, INTDIV(-60, 4) = -15
-61 / 4 = -15, INTDIV(-61, 4) = -15
-62 / 4 = -16, INTDIV(-62, 4) = -16
57 / -4 = -14, INTDIV(57, -4) = -14
58 / -4 = -15, INTDIV(58, -4) = -15
59 / -4 = -15, INTDIV(59, -4) = -15
60 / -4 = -15, INTDIV(60, -4) = -15
61 / -4 = -15, INTDIV(61, -4) = -15
62 / -4 = -16, INTDIV(62, -4) = -16
-57 / -4 = 14, INTDIV(-57, -4) = 14
-58 / -4 = 15, INTDIV(-58, -4) = 15
-59 / -4 = 15, INTDIV(-59, -4) = 15
-60 / -4 = 15, INTDIV(-60, -4) = 15
-61 / -4 = 15, INTDIV(-61, -4) = 15
-62 / -4 = 16, INTDIV(-62, -4) = 16
Upvotes: 0
Reputation: 144520
Here is a solution using integer arithmetic that computes the correct result for all values in the defined range: x
and y
can be any int
value with y != 0 && !(x == INT_MIN && y == -1)
.
Other integer based solutions behave incorrectly for values too close to INT_MIN
and/or INT_MAX
.
// simpler function if x >= 0 and y > 0
int rquotient_UU(int x, int y) {
int quo = x / y;
int rem = x % y;
return quo + (rem > ((y - 1) >> 1));
}
// generic function for y != 0 and !(x == INT_MIN && y == -1)
int rquotient_SS(int x, int y) {
int quo = x / y;
int rem = x % y;
if (rem == 0)
return quo;
// quo * y + rem = x
if (rem > 0) {
if (y > 0) {
return quo + (rem > (y - 1) / 2);
} else {
return quo - (rem > -((y + 1) / 2));
}
} else {
if (y > 0) {
return quo - (rem < -((y - 1) / 2));
} else {
return quo + (rem < ((y + 1) / 2));
}
}
}
These functions are only marginally slower than the ones tested by Jonathan Leffler. I expanded his test bench to include negative values and got this output on my old laptop:
rquotient_UU: 9.409108 (278977174548) rquotient_SS: 12.851408 (278977174548) rquotient_uu: 8.734572 (278977174548) rquotient_su: 8.700956 (278977174548) rquotient_ss: 12.079210 (278977174548) rquotient_dd: 12.554621 (278977174548)
Upvotes: 2
Reputation: 753455
The code presented here tests the performance of the 3 suggested functions in the answer by fearless_fool and the solution in the answer by Ayxan. The functions are modified to always take int
arguments (the const
in int const x
is not needed), but the test code only uses test values in the range where both x
and y
are non-negative.
The code uses a set of timing functions available in my SOQ (Stack Overflow Questions) repository on GitHub as files timer.c
and timer.h
in the src/libsoq sub-directory.
#define NDEBUG 1
#include "timer.h"
#include <assert.h>
#include <limits.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
/* JL: added static to rquotient_xx functions */
/* JL: removed two const qualifiers */
static
int rquotient_dd(int x, int y)
{
return (int)round((double)x / y);
}
/* JL: removed unsigned - added assert */
static
int rquotient_uu(int x, int y)
{
assert(x >= 0 && y > 0);
return (x + y / 2) / y;
}
/* JL: removed unsigned - added assert */
static
int rquotient_su(int x, int y)
{
assert(y > 0);
if (x > 0)
return (x + y / 2) / y;
else
return (x - y / 2) / y;
}
static
int rquotient_ss(int x, int y)
{
if ((x ^ y) > 0)
return (x + y / 2) / y;
else
return (x - y / 2) / y;
}
typedef int (*Divider)(int x, int y);
static void test_harness(const char *tag, Divider function)
{
Clock clk;
unsigned long long accumulator = 0;
clk_init(&clk);
clk_start(&clk);
for (int i = 1; i < INT_MAX / 1024; i += 13)
{
int max_div = i / 4;
if (max_div == 0)
max_div = 1;
for (int j = 1; j < max_div; j += 15)
accumulator += (*function)(i, j);
}
clk_stop(&clk);
char buffer[32];
printf("%s: %10s (%llu)\n", tag, clk_elapsed_us(&clk, buffer, sizeof(buffer)), accumulator);
}
int main(void)
{
for (int i = 0; i < 10; i++)
{
test_harness("rquotient_uu", rquotient_uu);
test_harness("rquotient_su", rquotient_su);
test_harness("rquotient_ss", rquotient_ss);
test_harness("rquotient_dd", rquotient_dd);
}
return 0;
}
The use of accumulator
serves two important purposes. First, it checks that the different computations produce the same results. Secondly, it ensures that the compiler cannot optimize the loops away — the accumulated value must be printed. It is reassuring to see that the accumulated value is the same on all tests. The oddball constants (INT_MAX / 1024
, 13
, 15
) are guessed values that yield reasonable times on the test machine — they mean the tests cover quite a lot of values, without taking inappropriately long times.
I ran the tests on a MacBook Pro (15 inch, 2017 — with a 2.9 GHz Intel Core i7 chip and 16 GiB of 2133 Mhz LPDDR3 RAM) running macOS 10.14.6 Mojave, compiled with (home-built) GCC 9.2.0 and the Xcode 11.3.1 toolchain.
$ gcc -O3 -g -I./inc -std=c11 -Wall -Wextra -Werror -Wmissing-prototypes -Wstrict-prototypes \
> iround53.c -o iround53 -L./lib -lsoq
$
One set of timing results was:
rquotient_uu: 6.272698 (286795780245)
rquotient_su: 6.257373 (286795780245)
rquotient_ss: 6.221263 (286795780245)
rquotient_dd: 10.956196 (286795780245)
rquotient_uu: 6.247602 (286795780245)
rquotient_su: 6.289057 (286795780245)
rquotient_ss: 6.258776 (286795780245)
rquotient_dd: 10.878083 (286795780245)
rquotient_uu: 6.256511 (286795780245)
rquotient_su: 6.286257 (286795780245)
rquotient_ss: 6.323997 (286795780245)
rquotient_dd: 11.055200 (286795780245)
rquotient_uu: 6.256689 (286795780245)
rquotient_su: 6.302265 (286795780245)
rquotient_ss: 6.296409 (286795780245)
rquotient_dd: 10.943110 (286795780245)
rquotient_uu: 6.239497 (286795780245)
rquotient_su: 6.238150 (286795780245)
rquotient_ss: 6.195744 (286795780245)
rquotient_dd: 10.975971 (286795780245)
rquotient_uu: 6.252275 (286795780245)
rquotient_su: 6.218718 (286795780245)
rquotient_ss: 6.241050 (286795780245)
rquotient_dd: 10.986962 (286795780245)
rquotient_uu: 6.254244 (286795780245)
rquotient_su: 6.213412 (286795780245)
rquotient_ss: 6.280628 (286795780245)
rquotient_dd: 10.963290 (286795780245)
rquotient_uu: 6.237975 (286795780245)
rquotient_su: 6.278504 (286795780245)
rquotient_ss: 6.286199 (286795780245)
rquotient_dd: 10.984483 (286795780245)
rquotient_uu: 6.219504 (286795780245)
rquotient_su: 6.208329 (286795780245)
rquotient_ss: 6.251772 (286795780245)
rquotient_dd: 10.983716 (286795780245)
rquotient_uu: 6.369181 (286795780245)
rquotient_su: 6.362766 (286795780245)
rquotient_ss: 6.299449 (286795780245)
rquotient_dd: 11.028050 (286795780245)
When analyzed, the mean and sample standard deviation for the different functions are:
Function Count Mean Standard deviation
rquotient_uu 10 6.260618 0.040679 (sample)
rquotient_su 10 6.265483 0.048249 (sample)
rquotient_ss 10 6.265529 0.039216 (sample)
rquotient_dd 10 10.975506 0.047673 (sample)
It doesn't take much statistical knowledge to see that there is essentially no performance difference between the three 'all integer' functions, because the difference between the three means is much less than one standard deviation (and to be significant, it would need to be more than one standard deviation). Nor does it take much skill to observe that converting to double
, dividing, rounding, and converting back to integer takes almost twice as long as the all-integer versions. In times (long) past, the integer vs floating-point discrepancy could have been a lot larger. There is a modest amount of overhead in the loop calculations and accumulation; that would widen the disparity between the integer and floating-point computations.
The machine running the test had various programs open in the background, but there were no videos playing, the browser was showing Stack Overflow rather than advert-laden pages, and I was tinkering on a cell phone while the test ran on the laptop. One attempted test run, during which I flicked between pages on the browser, showed much more erratic timing (longer times while I was using the browser, even though it is a multi-core machine).
Other tests with the condition if ((x ^ y) > 0)
corrected to if ((x ^ y) >= 0)
yielded slightly different timing results (but the same value for accumulator
):
rquotient_su 10 6.272791 0.037206
rquotient_dd 10 9.396147 0.047195
rquotient_uu 10 6.293301 0.056585
rquotient_ss 10 6.271035 0.052786
rquotient_su 10 6.187112 0.131749
rquotient_dd 10 9.100924 0.064599
rquotient_uu 10 6.127121 0.092406
rquotient_ss 10 6.203070 0.219747
rquotient_su 10 6.171390 0.133949
rquotient_dd 10 9.195283 0.124936
rquotient_uu 10 6.214054 0.177490
rquotient_ss 10 6.166569 0.138124
The performance difference for the floating-point arithmetic is not quite so pronounced, but still definitively in favour of integer arithmetic. The last of those tests, in particular, suggest there was some other activity on the machine while the tests were running — though that wasn't me looking at web pages or anything.
-ffast-math
Ayxan asked:
I wonder if
-ffast-math
would have made a difference.
I recompiled with the extra option, and it does indeed make a difference. Note that the original code was compiled with -O3
— it was optimized. However, the raw data from a run with -ffast-math
was:
rquotient_uu: 6.162182 (286795780245)
rquotient_su: 6.068469 (286795780245)
rquotient_ss: 6.041566 (286795780245)
rquotient_dd: 4.568538 (286795780245)
rquotient_uu: 6.143200 (286795780245)
rquotient_su: 6.071906 (286795780245)
rquotient_ss: 6.063543 (286795780245)
rquotient_dd: 4.543419 (286795780245)
rquotient_uu: 6.115283 (286795780245)
rquotient_su: 6.083157 (286795780245)
rquotient_ss: 6.063975 (286795780245)
rquotient_dd: 4.536071 (286795780245)
rquotient_uu: 6.078680 (286795780245)
rquotient_su: 6.072075 (286795780245)
rquotient_ss: 6.104850 (286795780245)
rquotient_dd: 4.585272 (286795780245)
rquotient_uu: 6.084941 (286795780245)
rquotient_su: 6.080311 (286795780245)
rquotient_ss: 6.069046 (286795780245)
rquotient_dd: 4.563945 (286795780245)
rquotient_uu: 6.075380 (286795780245)
rquotient_su: 6.236980 (286795780245)
rquotient_ss: 6.210127 (286795780245)
rquotient_dd: 4.787269 (286795780245)
rquotient_uu: 6.406603 (286795780245)
rquotient_su: 6.378812 (286795780245)
rquotient_ss: 6.194098 (286795780245)
rquotient_dd: 4.589568 (286795780245)
rquotient_uu: 6.243652 (286795780245)
rquotient_su: 6.132142 (286795780245)
rquotient_ss: 6.079181 (286795780245)
rquotient_dd: 4.595330 (286795780245)
rquotient_uu: 6.070584 (286795780245)
rquotient_su: 6.081373 (286795780245)
rquotient_ss: 6.075867 (286795780245)
rquotient_dd: 4.558105 (286795780245)
rquotient_uu: 6.106258 (286795780245)
rquotient_su: 6.091108 (286795780245)
rquotient_ss: 6.128787 (286795780245)
rquotient_dd: 4.553061 (286795780245)
And the statistics from that are:
rquotient_su 10 6.129633 0.101331
rquotient_dd 10 4.588058 0.072669
rquotient_uu 10 6.148676 0.104937
rquotient_ss 10 6.103104 0.057498
It doesn't take a statistical genius to spot that this shows the -ffast-math
floating-point alternative is now better than the integer version — by a similar factor to how integer was better than floating-point without the extra compiler option.
One more set of statistics with -ffast-math
. These show smaller variances (standard deviations), but the same overall result.
rquotient_su 10 6.060705 0.024372
rquotient_dd 10 4.543576 0.014742
rquotient_uu 10 6.057718 0.026419
rquotient_ss 10 6.061652 0.034652
For 32-bit integers, it would appear that with -ffast-math
, the code using double
can be faster than the code using only integers.
If the range was changed from 32-bit integers to 64-bit integers, then 64-bit doubles would not be able to represent all integer values exactly. At that point, if the numbers being divided are large enough, you could start finding accuracy errors (the accumulator results might well be different). A 64-bit double effectively has 53 bits to represent the mantissa, so if the number of bits in the integers were larger than that, accuracy drops.
Indeed, it might be safer to say "Your Milage WILL Vary".
Upvotes: 3
Reputation: 35149
If you know x
and y
both to be positive:
int rquotient_uu(unsigned int x, unsigned int y) {
return (x + y/2) / y;
}
If you know y
to be positive:
int rquotient_su(int x, unsigned int y) {
if (x > 0) {
return (x + y/2) / y;
} else {
return (x - y/2) / y;
}
}
If both are signed:
int rquotient_ss(int x, int y) {
if ((x ^ y) >= 0) { // beware of operator precedence
return (x + y/2) / y; // signs match, positive quotient
} else {
return (x - y/2) / y; // signs differ, negative quotient
}
}
And if you really want to baffle your future self or are addicted to code golf, please resist the urge to write it this way: ;)
int rquotient_ss(int x, int y) {
return (x + (((x^y)>=0)?y:-y)/2)/y;
}
Upvotes: 6
Reputation: 29945
A simple solution would be to use round
and double
:
#include <math.h>
int rquotient(int const x, int const y) {
return (int)round((double)x / y);
}
Upvotes: 2