Reputation: 2812

performance of log10 function returning an int

Today I needed a cheap log10() function, of which I only used the int part. Assuming the result is floored, so the log10() of 999 would be 2. Would it be beneficial writing a function myself? And if so, which way would be the best to go. Assuming the code would not be optimized.

The alternatives to log10() I've though of;

use a for loop dividing or multiplying by 10;
use a string parser(probably extremely expensive);
using an integer log2() function multiplying by a constant.

Upvotes: 15

Answers (7)

pan

Reputation: 173

Space-optimized version of StephenCanon's answer, uint32_t only. Uses a single LUT of 16 bytes and properly handles 0.

#include <limits.h>
#include <stdint.h>

#ifdef _MSC_VER
#include <intrin.h>
#pragma intrinsic(_BitScanReverse)
#endif

static inline unsigned msb(const unsigned x)
{
    unsigned lz;
#ifdef _MSC_VER
    (void) _BitScanReverse(&lz, x);
#else
    lz = __builtin_clz(x);
#endif
    return ((sizeof(x) * CHAR_BIT) - 1) - lz;
}

static unsigned b10digits(const uint32_t x)
{
    static const uint16_t lut[8] = { 0, 2, 13, 69, 347, 1736, 8680, 43402 };
    const unsigned
        shr = (msb(x | 1) & ((sizeof(x) * CHAR_BIT) - 2)) << 1,
        dig = (0x8877665433221100ull >> shr) & 0xF,
        val = lut[(0x07D63440u >> (dig * 3)) & 0x7],
        p10 = ((val << dig) + (~((unsigned)-1 << dig) & 0xC7)) * 90 + 10;
    return dig + (x >= p10 ? 2 : 1);
}

Upvotes: 1

Paolo Bonzini

Reputation: 1930

Because an unsigned < or >= test is done simply by subtracting and checking the carry flag, it is possible to put both arrays (guess and negated tenToThe) in a single 64-bit value, combine both array lookups into one, and use the carry from 32-bit addition to adjust the guess. The high 32 bits of guess[n] provide the value of log10(2^n*2-1), while the low 32 bits contain -10^log10(2^n*2-1).

static unsigned int baseTwoDigits(unsigned int x) {
    return x ? 32 - __builtin_clz(x) : 0;
}

unsigned int baseTenDigits(unsigned int x) {
    static uint64_t guess[33] = {
      /* 1 */         0, 0, 0, 
      /* 8 */         (1ull<<32)-10, (1ull<<32)-10, (1ull<<32)-10, 
      /* 64 */        (2ull<<32)-100, (2ull<<32)-100, (2ull<<32)-100, 
      /* 512 */       (3ull<<32)-1000, (3ull<<32)-1000, (3ull<<32)-1000, 
                      (3ull<<32)-1000, 
      /* 8192 */      (4ull<<32)-10000, (4ull<<32)-10000, (4ull<<32)-10000, 
      /* 65536 */     (5ull<<32)-100000, (5ull<<32)-100000, (5ull<<32)-100000, 
      /* 524288 */    (6ull<<32)-1000000, (6ull<<32)-1000000, (6ull<<32)-1000000, 
                      (6ull<<32)-1000000, 
      /* 8388608 */   (7ull<<32)-10000000, (7ull<<32)-10000000,
                      (7ull<<32)-10000000, 
      /* 67108864 */  (8ull<<32)-100000000, (8ull<<32)-100000000, 
                      (8ull<<32)-100000000, 
      /* 536870912 */ (9ull<<32)-1000000000, (9ull<<32)-1000000000, 
                      (9ull<<32)-1000000000, 
    };
    uint64_t adjust = guess[baseTwoDigits(x)];
    return (adjust + x) >> 32;
}

Upvotes: 1

Stephen Canon

Reputation: 106117

The operation can be done in (fast) constant time on any architecture that has a count-leading-zeros or similar instruction (which is most architectures). Here's a C snippet I have sitting around to compute the number of digits in base ten, which is essentially the same task (assumes a gcc-like compiler and 32-bit int):

unsigned int baseTwoDigits(unsigned int x) {
    return x ? 32 - __builtin_clz(x) : 0;
}

static unsigned int baseTenDigits(unsigned int x) {
    static const unsigned char guess[33] = {
        0, 0, 0, 0, 1, 1, 1, 2, 2, 2,
        3, 3, 3, 3, 4, 4, 4, 5, 5, 5,
        6, 6, 6, 6, 7, 7, 7, 8, 8, 8,
        9, 9, 9
    };
    static const unsigned int tenToThe[] = {
        1, 10, 100, 1000, 10000, 100000, 
        1000000, 10000000, 100000000, 1000000000,
    };
    unsigned int digits = guess[baseTwoDigits(x)];
    return digits + (x >= tenToThe[digits]);
}

GCC and clang compile this down to ~10 instructions on x86. With care, one can make it faster still in assembly.

The key insight is to use the (extremely cheap) base-two logarithm to get a fast estimate of the base-ten logarithm; at that point we only need to compare against a single power of ten to decide if we need to adjust the guess. This is much more efficient than searching through multiple powers of ten to find the right one.

If the inputs are overwhelmingly biased to one- and two-digit numbers, a linear scan is sometimes faster; for all other input distributions, this implementation tends to win quite handily.

Upvotes: 42

Mike Dunlavey

Reputation: 40649

Well, there's the old standby - the "poor man's log function". (If you want to handle more than 63 integer digits, change the first "if" to a "while".)

n = 1;
if (v >= 1e32){n += 32; v /= 1e32;}
if (v >= 1e16){n += 16; v /= 1e16;}
if (v >=  1e8){n +=  8; v /=  1e8;}
if (v >=  1e4){n +=  4; v /=  1e4;}
if (v >=  1e2){n +=  2; v /=  1e2;}
if (v >=  1e1){n +=  1; v /=  1e1;}

so if you feed in 123456.7, here's how it goes:

n = 1;
if (v >= 1e32) no
if (v >= 1e16) no
if (v >=  1e8) no
if (v >=  1e4) yes, so n = 5, v = 12.34567
if (v >=  1e2) no
if (v >=  1e1) yes, so n = 6, v = 1.234567
     so result is n = 6

Here's a variation that uses multiplication, rather than division:

int n = 1;
double d = 1, temp;
temp = d * 1e32; if (v >= temp){n += 32; d = temp;}
temp = d * 1e16; if (v >= temp){n += 16; d = temp;}
temp = d *  1e8; if (v >= temp){n +=  8; d = temp;}
temp = d *  1e4; if (v >= temp){n +=  4; d = temp;}
temp = d *  1e2; if (v >= temp){n +=  2; d = temp;}
temp = d *  1e1; if (v >= temp){n +=  1; d = temp;}

and an execution looks like this

v = 123456.7
n = 1
d = 1
temp = 1e32, if (v >= 1e32) no
temp = 1e16, if (v >= 1e16) no
temp =  1e8, if (v >=  1e8) no
temp =  1e4, if (v >=  1e4) yes, so n = 5, d = 1e4;
temp =  1e6, if (v >=  1e6) no
temp =  1e5, if (v >=  1e5) yes, so n = 6, d = 1e5;

Upvotes: 3

Dennis Jaheruddin

Reputation: 21563

Without any specifications, I will just give a general answer:

The log function will be pretty efficient in most languages as it is such a basic function.

The fact that you are only interested in integers could give you some leverage, but probably this is not enough to easily beat the builtin standard solutions.

One of the few things that I can think of to be faster than a builtin function is a table lookup, so if you are only interested in the numbers upto 10000 for instance, you could simply create a table that you could use to lookup any of these values when you need them.

Obviously this solution will not scale well, but it may be just what you need.

Sidenote: If you are importing the data for example, it may actually be faster to look at the string diecty length (rather than first converting the string to a number and than looking at the value of the string). Of course this will require the input to be stored in just the right format, otherwise it won't gain you anything.

Upvotes: -1

Slava Bacherikov

Reputation: 2066

One way to do it would be loop with subtracting powers of 10. This powers could be computed and stored in table. Here example in python:

table = [10**i for i in range(1, 10)]
# [10, 100, 1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000]

def fast_log10(n):
    for i, k in enumerate(table):
        if n - k < 0:
           return i

Usage example:

>>> fast_log10(1)
0
>>> fast_log10(10)
1
>>> fast_log10(100)
2
>>> fast_log10(999)
2
fast_log10(1000)
3

Also you may use binary search with this table. Then algorithm complexity would be only O(lg(n)), where n is number of digits. Here example with binary search in C:

long int table[] = {10, 100, 1000, 10000, 1000000};
#define TABLE_LENGHT sizeof(table) / sizeof(long int)

int bisect_log10(long int n, int s, int e) {
    int a = (e - s) / 2 + s;
    if(s >= e)
        return s;
    if((table[a] - n) <= 0)
        return bisect_log10(n, a + 1, e);
    else
        return bisect_log10(n, s, a);
}

int fast_log10(long int n){
    return bisect_log10(n, 0, TABLE_LENGHT);
}

Note for small numbers this method would slower then upper method. Full code here.

Upvotes: 2

Westranger

Reputation: 1367

If you want to have a faster log function you need to approximate their result. E.g. the exp function can be approximated using a 'short' taylor approximation. You can find example approximations for exp, log, root and power here

edit: You can find a short performance comparsion here

Upvotes: 1

performance of log10 function returning an int

Answers (7)

Related Questions