SpFW
SpFW

Reputation: 1155

Sum the digits of a number

If I want to find the sum of the digits of a number, i.e.:

What is the fastest way of doing this?

I instinctively did:

sum(int(digit) for digit in str(number))

and I found this online:

sum(map(int, str(number)))

Which is best to use for speed, and are there any other methods which are even faster?

Upvotes: 110

Views: 299464

Answers (11)

wim
wim

Reputation: 363324

Whether it's faster to work with math or strings here depends on the size of the input number.

For small numbers (fewer than 30 digits in length), use division and modulus:

def sum_digits_math(n):
    r = 0
    while n:
        r, n = r + n % 10, n // 10
    return r

For large numbers (greater than 30 digits in length), use the string domain:

def sum_digits_str_fast(n):
    d = str(n)
    return sum(int(s) * d.count(s) for s in "123456789")

The performance profile using math (blue line on the graph below) scales poorly as the input number is bigger, but using the string domain appears to scale linearly in the length of the input.

profile

The code that was used to generate this graph may be found here. I'm using CPython 3.13.1 on macOS.

Upvotes: 1

RARE Kpop Manifesto
RARE Kpop Manifesto

Reputation: 2895

Why is the highest rated answer 3.70x slower than this ?

% echo; ( time (nice echo 33785139853861968123689586196851968365819658395186596815968159826259681256852169852986 \
 | mawk2 'gsub(//,($_)($_)($_))+gsub(//,($_))+1' | pvE0 \
 | mawk2 '
   
   function __(_,___,____,_____) {

                  ____=gsub("[^1-9]+","",_)~""
                ___=10
       while((+____<--___) && _) {
            _____+=___*gsub(___,"",_) 
       }
       return _____+length(_) } 

    BEGIN { FS=OFS=ORS
                RS="^$" 
    } END { 
            print __($!_) }' )| pvE9 ) | gcat -n | lgp3 ;

      in0:  173MiB 0:00:00 [1.69GiB/s] [1.69GiB/s] [<=>                                            ]
     out9: 11.0 B 0:00:09 [1.15 B/s] [1.15 B/s] [<=>                                               ]
      in0:  484MiB 0:00:00 [2.29GiB/s] [2.29GiB/s] [  <=>                                          ]
( nice echo  | mawk2 'gsub(//,($_)($_)($_))+gsub(//,($_))+1' | pvE 0.1 in0 | )  

 8.52s user 1.10s system 100% cpu 9.576 total
 1  2822068024



% echo; ( time ( nice echo 33785139853861968123689586196851968365819658395186596815968159826259681256852169852986 \
     \
     | mawk2 'gsub(//,($_)($_)($_))+gsub(//,($_))+1' | pvE0 \
     |  gtr -d '\n' \
     \
     |  python3 -c 'import math, os, sys;

        [ print(sum(int(digit) for digit in str(ln)), \
                                            end="\n") \
         \
         for ln in sys.stdin ]' )| pvE9 ) | gcat -n | lgp3 ;


      in0:  484MiB 0:00:00 [ 958MiB/s] [ 958MiB/s] [     <=>                                       ]
     out9: 11.0 B 0:00:35 [ 317miB/s] [ 317miB/s] [<=>                                             ]
( nice echo  | mawk2 'gsub(//,($_)($_)($_))+gsub(//,($_))+1' | pvE 0.1 in0 | )  

 35.22s user 0.62s system 101% cpu 35.447 total
     1  2822068024

And that's being a bit generous already. On this large synthetically created test case of 2.82 GB, it's 19.2x slower.

 % echo; ( time ( pvE0 < testcases_more108.txt  |  mawk2 'function __(_,___,____,_____) { ____=gsub("[^1-9]+","",_)~"";___=10; while((+____<--___) && _) { _____+=___*gsub(___,"",_) }; return _____+length(_) } BEGIN { FS=RS="^$"; CONVFMT=OFMT="%.20g" } END { print __($_) }'  ) | pvE9 ) |gcat -n | ggXy3  | lgp3; 

      in0:  284MiB 0:00:00 [2.77GiB/s] [2.77GiB/s] [=>                             ]  9% ETA 0:00:00
     out9: 11.0 B 0:00:11 [1016miB/s] [1016miB/s] [<=>                                             ]
      in0: 2.82GiB 0:00:00 [2.93GiB/s] [2.93GiB/s] [=============================>] 100%            
( pvE 0.1 in0 < testcases_more108.txt | mawk2 ; )

  8.75s user 2.36s system 100% cpu 11.100 total
     1  3031397722

% echo; ( time ( pvE0 < testcases_more108.txt  | gtr -d '\n' |  python3 -c 'import sys; [ print(sum(int(_) for _ in str(__))) for __ in sys.stdin ]' ) | pvE9 ) |gcat -n | ggXy3  | lgp3;  


      in0: 2.82GiB 0:00:02 [1.03GiB/s] [1.03GiB/s] [=============================>] 100%            
     out9: 11.0 B 0:03:32 [53.0miB/s] [53.0miB/s] [<=>                                             ]
( pvE 0.1 in0 < testcases_more108.txt | gtr -d '\n' | python3 -c ; )  

  211.47s user 3.02s system 100% cpu 3:32.69 total
     1  3031397722

—————————————————————

UPDATE : native python3 code of that concept - even with my horrific python skills, i'm seeing a 4x speedup :

% echo; ( time ( pvE0 < testcases_more108.txt  \
\
 |python3 -c 'import re, sys;

  print(sum([ sum(int(_)*re.subn(_,"",__)[1] 

     for _ in [r"1",r"2", r"3",r"4",
               r"5",r"6",r"7",r"8",r"9"])

 for __ in sys.stdin ]))' |pvE9))|gcat -n| ggXy3|lgp3   

      in0: 1.88MiB 0:00:00 [18.4MiB/s] [18.4MiB/s] [>                              ]  0% ETA 0:00:00
     out9: 0.00 B 0:00:51 [0.00 B/s] [0.00 B/s] [<=>                                               ]
      in0: 2.82GiB 0:00:51 [56.6MiB/s] [56.6MiB/s] [=============================>] 100%            
     out9: 11.0 B 0:00:51 [ 219miB/s] [ 219miB/s] [<=>                                             ]

( pvE 0.1 in0 < testcases_more108.txt | python3 -c  | pvE 0.1 out9; ) 



48.07s user 3.57s system 100% cpu 51.278 total
 1  3031397722

Even the smaller test case managed a 1.42x speed up :

 echo; ( time (nice echo 33785139853861968123689586196851968365819658395186596815968159826259681256852169852986 \ 
 | mawk2 'gsub(//,($_)($_)$_)+gsub(//,$_)+1' ORS='' | pvE0 | python3 -c 'import re, sys; print(sum([ sum(int(_)*re.subn(_,"",__)[1] for _ in [r"1",r"2", r"3",r"4",r"5",r"6",r"7",r"8",r"9"]) for __ in sys.stdin ]))'  | pvE9  ))  |gcat -n | ggXy3 | lgp3 


      in0:  484MiB 0:00:00 [2.02GiB/s] [2.02GiB/s] [  <=>                                          ]
     out9: 11.0 B 0:00:24 [ 451miB/s] [ 451miB/s] [<=>                                             ]
( nice echo  | mawk2 'gsub(//,($_)($_)$_)+gsub(//,$_)+1' ORS='' | pvE 0.1 in0)

 20.04s user 5.10s system 100% cpu 24.988 total
   1    2822068024

Upvotes: -4

Pavel Anossov
Pavel Anossov

Reputation: 62948

Both lines you posted are fine, but you can do it purely in integers, and it will be the most efficient:

def sum_digits(n):
    s = 0
    while n:
        s += n % 10
        n //= 10
    return s

or with divmod:

def sum_digits2(n):
    s = 0
    while n:
        n, remainder = divmod(n, 10)
        s += remainder
    return s

Slightly faster is using a single assignment statement:

def sum_digits3(n):
   r = 0
   while n:
       r, n = r + n % 10, n // 10
   return r

> %timeit sum_digits(n)
1000000 loops, best of 3: 574 ns per loop

> %timeit sum_digits2(n)
1000000 loops, best of 3: 716 ns per loop

> %timeit sum_digits3(n)
1000000 loops, best of 3: 479 ns per loop

> %timeit sum(map(int, str(n)))
1000000 loops, best of 3: 1.42 us per loop

> %timeit sum([int(digit) for digit in str(n)])
100000 loops, best of 3: 1.52 us per loop

> %timeit sum(int(digit) for digit in str(n))
100000 loops, best of 3: 2.04 us per loop

Upvotes: 139

Tsvetelin Tsvetkov
Tsvetelin Tsvetkov

Reputation: 45

Doing some Codecademy challenges I resolved this like:

def digit_sum(n):
    digits = []
    nstr = str(n)
    for x in nstr:
        digits.append(int(x))
    return sum(digits)

Upvotes: 2

snakecharmerb
snakecharmerb

Reputation: 55913

A base 10 number can be expressed as a series of the form

a × 10^p + b × 10^p-1 .. z × 10^0

so the sum of a number's digits is the sum of the coefficients of the terms.

Based on this information, the sum of the digits can be computed like this:

import math

def add_digits(n):
    # Assume n >= 0, else we should take abs(n)
    if 0 <= n < 10:
        return n
    r = 0
    ndigits = int(math.log10(n))
    for p in range(ndigits, -1, -1):
        d, n = divmod(n, 10 ** p)
        r += d
    return r

This is effectively the reverse of the continuous division by 10 in the accepted answer. Given the extra computation in this function compared to the accepted answer, it's not surprising to find that this approach performs poorly in comparison: it's about 3.5 times slower, and about twice as slow as

sum(int(x) for x in str(n))

Upvotes: 0

Ash
Ash

Reputation: 183

Found this on one of the problem solving challenge websites. Not mine, but it works.

num = 0            # replace 0 with whatever number you want to sum up
print(sum([int(k) for k in str(num)]))

Upvotes: 8

Khizar Anjum
Khizar Anjum

Reputation: 9

Here is a solution without any loop or recursion but works for non-negative integers only (Python3):

def sum_digits(n):
    if n > 0:
        s = (n-1) // 9    
        return n-9*s
    return 0

Upvotes: 0

praveen nani
praveen nani

Reputation: 87

Try this

    print(sum(list(map(int,input("Enter your number ")))))

Upvotes: 0

simran sinha
simran sinha

Reputation: 1

you can also try this with built_in_function called divmod() ;

number = int(input('enter any integer: = '))
sum = 0
while number!=0: 
    take = divmod(number, 10) 
    dig = take[1] 
    sum += dig 
    number = take[0] 
print(sum) 

you can take any number of digit

Upvotes: -2

d8aninja
d8aninja

Reputation: 3653

If you want to keep summing the digits until you get a single-digit number (one of my favorite characteristics of numbers divisible by 9) you can do:

def digital_root(n):
    x = sum(int(digit) for digit in str(n))
    if x < 10:
        return x
    else:
        return digital_root(x)

Which actually turns out to be pretty fast itself...

%timeit digital_root(12312658419614961365)

10000 loops, best of 3: 22.6 µs per loop

Upvotes: 16

Aftab Lateef
Aftab Lateef

Reputation: 79

This might help

def digit_sum(n):
    num_str = str(n)
    sum = 0
    for i in range(0, len(num_str)):
        sum += int(num_str[i])
    return sum

Upvotes: 7

Related Questions