Myria
Myria

Reputation: 3797

Are there ARM intrinsics for add-with-carry in C?

Do there exist intrinsics for ARM C compilers to do add-with-carry operations, or is it necessary to use assembly language?

On x86, there is _addcarry_u64 for add-with-carry. (There's also the newer _addcarryx_u64 for special purposes.)

Upvotes: 6

Views: 1986

Answers (2)

artless-noise-bye-due2AI
artless-noise-bye-due2AI

Reputation: 22395

There is no intrinsic with current versions of gcc (gcc5 was released the year this question was asked). An issue is that communication of the 'carry flag'. However, the ARM backend does know and define a set of ADC primitives such as addsi3_carryin.

For example,

unsigned long long big_inc(unsigned long long x)
{
  return ++x;
}

Is translated to,

big_inc(unsigned long long):
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        adds    r0, r0, #1
        adc     r1, r1, #0
        bx      lr

It is always instructive to look at open source multi-precision libraries when you have a question like this. There is OpenSSL bignum and GNU MP libraries without any research. As the intrinsic doesn't exist a more definitive answer (for your work) depends on exactly what it is you want to achieve; prime factors, multiply, add, etc. You can always use assembler or more powerfully use a script that generates assembler for your particular integer length.

Upvotes: 8

Pierre
Pierre

Reputation: 570

From old documentation (as old as gcc 5 !!!!)

https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html https://gcc.gnu.org/onlinedocs/gcc-5.3.0/gcc/Integer-Overflow-Builtins.html

Both clang and GCC do implement these builtins, and I verified the generated code is optimal on both x86_64 and aarch64 targets

#include <stdint.h>

typedef unsigned __int128 uint128_t;


// carry_out = a + b + carry_in
uint8_t my_addcarry_u64(uint8_t carry_in, uint64_t a, uint64_t b, uint64_t * sum)
{
        bool c;
        uint64_t res;
        c = __builtin_uaddll_overflow (a, b, (long long unsigned *)&res);
        c |= __builtin_uaddll_overflow (res, carry_in, (long long unsigned *)&res);
        *sum = res;
        return c;
}

// carry_out = a + b + carry_in
uint8_t my_addcarry_u128(uint8_t carry_in, uint128_t a, uint128_t b, uint128_t * sum)
{
        bool c;
        uint64_t res_lo, res_hi;
        c = __builtin_uaddll_overflow (a, b, (long long unsigned *)&res_lo);
        c |= __builtin_uaddll_overflow (carry_in, res_lo, (long long unsigned *)&res_lo);
        c = __builtin_uaddll_overflow (a >> 64, c, (long long unsigned *)&res_hi);
        c |= __builtin_uaddll_overflow (b >> 64, res_hi, (long long unsigned *)&res_hi);
        *sum = ((uint128_t)res_hi << 64) + res_lo;
        return c;
}

Even if the original post is old, I provide a solution to the original question, in case someone reads this thread again

Upvotes: 3

Related Questions