Reputation: 59287
Though the representation of a number is somewhat relative, we generally use the decimal form when printing to the outside world.
I'm on Mac OS X, and while analyzing the libc source I found that the famous printf function ends up in a little function called __ultoa, after going through vfprintf_l and the 1104-line __vfprintf. It's defined as follows (in this case all of this comes straight from FreeBSD):
/*
 * Convert an unsigned long to ASCII for printf purposes, returning
 * a pointer to the first character of the string representation.
 * Octal numbers can be forced to have a leading zero; hex numbers
 * use the given digits.
 */
static CHAR *
__ultoa(u_long val, CHAR *endp, int base, int octzero, const char *xdigs)
{
    CHAR *cp = endp;
    long sval;

    /*
     * Handle the three cases separately, in the hope of getting
     * better/faster code.
     */
    switch (base) {
    case 10:
        if (val < 10) {         /* many numbers are 1 digit */
            *--cp = to_char(val);
            return (cp);
        }
        /*
         * On many machines, unsigned arithmetic is harder than
         * signed arithmetic, so we do at most one unsigned mod and
         * divide; this is sufficient to reduce the range of
         * the incoming value to where signed arithmetic works.
         */
        if (val > LONG_MAX) {
            *--cp = to_char(val % 10);
            sval = val / 10;
        } else
            sval = val;
        do {
            *--cp = to_char(sval % 10);
            sval /= 10;
        } while (sval != 0);
        break;

    case 8:
        do {
            *--cp = to_char(val & 7);
            val >>= 3;
        } while (val);
        if (octzero && *cp != '0')
            *--cp = '0';
        break;

    case 16:
        do {
            *--cp = xdigs[val & 15];
            val >>= 4;
        } while (val);
        break;

    default:                    /* oops */
        LIBC_ABORT("__ultoa: invalid base=%d", base);
    }
    return (cp);
}
Here CHAR is just typedef'ed to char (for some reason), and to_char does basically what you'd expect:
#define to_char(n) ((n) + '0')
The conversion to decimal form happens in a straightforward way, dividing by 10 and taking the remainder mod 10:
do {
    *--cp = to_char(sval % 10);
    sval /= 10;
} while (sval != 0);
However, while this works for small numbers (up to 8 bytes), it seems like too much "manual labor" to me. In GMP, you can easily calculate 2^5000:
mpz_t n;
mpz_init(n);
mpz_ui_pow_ui(n, 2ul, 5000ul);
gmp_printf("%Zd\n", n);
While this has an easy representation in bases 2 or 16, the decimal form is harder to calculate.
So, how exactly do libraries like GMP handle this? It looks like taking modulos and divisions can be expensive for such big numbers. Is there a faster algorithm, or am I wrong and the standard process is easy for computers?
Upvotes: 1
Views: 658
Reputation: 215257
The standard process is not easy; one way or another you need to perform the equivalent operations to obtain the decimal digits, and that can involve high-precision arithmetic even if the original binary value is just a few bits, or a single bit. See my question:
How do you print the EXACT value of a floating point number?
It's about floating point, but all large floating-point numbers are integers anyway, and the very-large and very-small cases are the only interesting ones.
Upvotes: 3