Tom Heng
Tom Heng

Reputation: 35

the different efficiency for number digits counting with python & C

I try to write a function to count number digits,and by the way,I try to compare the efficiency of the different way. 1.the lenstr(i) way:

def nDigits(i):
    return len(str(i))
for i in range(100000):
    print nDigits(i)

it takes about 143.75s

2.the log10 way:

import math
def nDigits(i):
    if i > 0:
        n = int(math.log10(i)) + 1
    elif i == 0:
        n = 1
    else:
        n = int(math.log10(-i)) + 2
    return n

for i in range(100000):
    print nDigits(i)

it takes about 144.35s

3.the division way:

def nDigits(i):
    t = 0
    while i > 0:
        t += 1
        i /= 10
    return t
for i in range(100000):
    print nDigits(i)

it takes about 143.43s

4.the division way in c:

#include<stdio.h>

int digits(int num){
    int i = 0;
    while (num > 0){
        i += 1;
        num /= 10;
    }
    return i;
} 

void main(){
    int i = 0;
    while (i < 100000){
        i += 1;
        printf("%d",digits(i));    
    }
}

it takes about 0.07s

Is the C is 2000 times better than python...or There is a better way for python to counting number digits. thx guys,plz help me.

Upvotes: 1

Views: 357

Answers (3)

Blender
Blender

Reputation: 298432

Simplify your test cases and remove all of those prints:

import math

def num_digits1(n):
    return len(str(n))

def num_digits2(n):
    return int(math.log10(n)) + 1

def num_digits3(n):
    t = 0

    while n:
        t += 1
        n /= 10

    return t

Here are my results:

>>> %timeit num_digits1(random.randint(1, 100000000))
100000 loops, best of 3: 1.64 us per loop
>>> %timeit num_digits2(random.randint(1, 100000000))
100000 loops, best of 3: 1.87 us per loop
>>> %timeit num_digits3(random.randint(1, 100000000))
100000 loops, best of 3: 2.49 us per loop
>>> %timeit random.randint(1, 100000000)
1000000 loops, best of 3: 1.29 us per loop

Subtracting the time it takes to generate a random number, I get:

num_digits1  0.35 us
num_digits2  0.58 us
num_digits3  1.20 us

And my C code comparison (which I hope is fair):

#include <stdlib.h>

int rand_int(int min, int max) {
    return min + (rand() / (double) RAND_MAX) / (max - min);
}

int num_digits(int num) {
    int i = 0;

    while (num > 0){
        i += 1;
        num /= 10;
    }

    return i;
} 

int main() {
    int i;

    for (i = 0; i < 10000000; i++) {
        num_digits(rand_int(1, 100000000));
    }

    return 0;
}

I run it:

$ gcc test.c -o test
$ time ./test./test
0.15s user 0.00s system 97% cpu 0.154 total

And my time is:

  0.154 s / 10,000,000
= 0.0154 us (0.0138 us with -O3)

The C code is about 23 times faster than the Python solution, which seems normal. Hopefully my C random number generator works.

With PyPy, I get 66.7 ns (not us) for num_digits1, which is only 4.3 times slower.

Upvotes: 1

zzk
zzk

Reputation: 1387

That slow?? If you change for i in range(100000): to for i in xrange(100000):, it is much faster, at least on my computer ( 1 sec or 2 or 3).

I suspect the slowness is caused by your usage of range(100000)

xrange is more efficient because instead of generating a list of objects, it just generates one object at a time. You should favor it over range in this case.

EDIT: after @cge mentioned the problem, i tested your original code (using range(100000)) and it was also finished pretty fast, within a sec or two, so this may not be the cause of your problem (something fishy happened, which I couldn't see from the code you posted here), but I suggest you use xrange anyway.

Upvotes: 1

casevh
casevh

Reputation: 11424

I think your bottleneck is the print statement. Try saving the results in a list instead.

def nDigits(i):
    return len(str(i))
results = []
for i in xrange(1000000):
    results.append(nDigits(i))
print len(results)

I used xrange instead of range and added an extra 0. It executes in 0.45 seconds on my machine.

Using a list comprehension gets the time down to 0.37 seconds.

def nDigits(i):
    return len(str(i))
results = [nDigits(i) for i in xrange(1000000)]
print len(results)

Removing the function call overhead gets the time down to 0.31 seconds.

results = [len(str(i)) for i in xrange(1000000)]
print len(results)

Upvotes: 1

Related Questions