Reputation: 3797
Do there exist intrinsics for ARM C compilers to do add-with-carry operations, or is it necessary to use assembly language?
On x86, there is _addcarry_u64 for add-with-carry. (There's also the newer _addcarryx_u64 for special purposes.)
Upvotes: 6
Views: 1986
Reputation: 22395
Current versions of gcc provide no such intrinsic (gcc 5 was released the year this question was asked). One difficulty is communicating the 'carry flag' from one operation to the next. However, the ARM backend does know about, and defines, a set of ADC primitives such as addsi3_carryin.
For example,
unsigned long long big_inc(unsigned long long x)
{
return ++x;
}
is translated to:
big_inc(unsigned long long):
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
adds r0, r0, #1
adc r1, r1, #0
bx lr
It is always instructive to look at open source multi-precision libraries when you have a question like this. OpenSSL's bignum code and the GNU MP library come to mind without any research. Since the intrinsic doesn't exist, a more definitive answer (for your work) depends on exactly what you want to achieve: finding prime factors, multiplying, adding, etc. You can always use assembler or, more powerfully, use a script that generates assembler for your particular integer length.
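For plain addition there is also a portable fallback that needs neither an intrinsic nor assembler: the carry out of an unsigned add can be recovered by comparing the wrapped sum with one of the operands. The following is only a minimal sketch; the names `limb_t` and `bignum_add` are made up for illustration and are not taken from OpenSSL or GNU MP. Whether a given compiler turns the loop into an adds/adcs chain depends on its version and optimization level, so inspect the generated assembly before relying on it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limb type: one 32-bit ARM register per limb. */
typedef uint32_t limb_t;

/* Hypothetical helper: r = a + b over n little-endian limbs.
 * Returns the carry out of the most significant limb (0 or 1). */
limb_t bignum_add(limb_t *r, const limb_t *a, const limb_t *b, size_t n)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t s  = a[i] + carry;  /* may wrap around to 0 */
        limb_t c1 = s < carry;     /* 1 only if adding the carry wrapped */
        limb_t t  = s + b[i];      /* may wrap as well */
        limb_t c2 = t < s;         /* 1 only if adding b[i] wrapped */
        r[i] = t;
        carry = c1 | c2;           /* both cannot be 1 at the same time */
    }
    return carry;
}
```

The two comparisons replace the hardware carry flag, which C has no way to name directly.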
Upvotes: 8
Reputation: 570
From the documentation (these builtins go back at least as far as gcc 5):
https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html
https://gcc.gnu.org/onlinedocs/gcc-5.3.0/gcc/Integer-Overflow-Builtins.html
Both clang and GCC implement these builtins, and I verified that the generated code is optimal on both x86_64 and aarch64 targets:
#include <stdbool.h>
#include <stdint.h>

typedef unsigned __int128 uint128_t;

// carry_out:sum = a + b + carry_in (carry_in must be 0 or 1)
uint8_t my_addcarry_u64(uint8_t carry_in, uint64_t a, uint64_t b, uint64_t *sum)
{
    bool c;
    unsigned long long res;  // the builtin's own output type, so no pointer cast is needed
    c  = __builtin_uaddll_overflow(a, b, &res);
    c |= __builtin_uaddll_overflow(res, carry_in, &res);
    *sum = res;
    return c;
}

// carry_out:sum = a + b + carry_in (carry_in must be 0 or 1)
uint8_t my_addcarry_u128(uint8_t carry_in, uint128_t a, uint128_t b, uint128_t *sum)
{
    bool c;
    unsigned long long res_lo, res_hi;
    // low 64 bits: a_lo + b_lo + carry_in
    c  = __builtin_uaddll_overflow((uint64_t)a, (uint64_t)b, &res_lo);
    c |= __builtin_uaddll_overflow(carry_in, res_lo, &res_lo);
    // high 64 bits: a_hi + b_hi + carry out of the low half
    c  = __builtin_uaddll_overflow((uint64_t)(a >> 64), c, &res_hi);
    c |= __builtin_uaddll_overflow((uint64_t)(b >> 64), res_hi, &res_hi);
    *sum = ((uint128_t)res_hi << 64) + res_lo;
    return c;
}
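To show how such a helper chains across limbs, here is a small sketch of a multi-limb add. The names `addcarry_u64` and `add_limbs_u64` are hypothetical; the first is just a compact restatement of my_addcarry_u64 so that the block compiles on its own. It assumes a compiler that provides `__builtin_uaddll_overflow` (GCC 5 or later, or clang), and I have not checked the generated code for this loop.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical restatement of my_addcarry_u64; carry_in must be 0 or 1. */
uint8_t addcarry_u64(uint8_t carry_in, uint64_t a, uint64_t b, uint64_t *sum)
{
    unsigned long long tmp;
    bool c = __builtin_uaddll_overflow(a, b, &tmp);
    c |= __builtin_uaddll_overflow(tmp, carry_in, &tmp);
    *sum = (uint64_t)tmp;
    return c;
}

/* Hypothetical helper: r = a + b over n little-endian 64-bit limbs.
 * Returns the carry out of the most significant limb (0 or 1). */
uint8_t add_limbs_u64(uint64_t *r, const uint64_t *a, const uint64_t *b, size_t n)
{
    uint8_t carry = 0;
    for (size_t i = 0; i < n; i++)
        carry = addcarry_u64(carry, a[i], b[i], &r[i]);  /* carry ripples upward */
    return carry;
}
```

With n = 4 this performs a 256-bit addition; the returned carry tells the caller whether the result overflowed the limb array.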
Although the original post is old, I am providing a solution to the original question in case someone reads this thread again.
Upvotes: 3