Reputation: 1155
If I want to find the sum of the digits of a number, e.g.:
932 -> 14, which is (9 + 3 + 2)
What is the fastest way of doing this?
I instinctively did:
sum(int(digit) for digit in str(number))
and I found this online:
sum(map(int, str(number)))
Which is best to use for speed, and are there any other methods which are even faster?
Upvotes: 110
Views: 299464
Reputation: 363324
Whether it's faster to work with math or strings here depends on the size of the input number.
For small numbers (fewer than 30 digits in length), use division and modulus:
def sum_digits_math(n):
    r = 0
    while n:
        r, n = r + n % 10, n // 10
    return r
For large numbers (more than 30 digits in length), use the string domain:
def sum_digits_str_fast(n):
    d = str(n)
    return sum(int(s) * d.count(s) for s in "123456789")
The math-based approach (blue line on the graph below) scales poorly as the input number grows, whereas the string-based approach appears to scale linearly in the length of the input.
The code that was used to generate this graph may be found here. I'm using CPython 3.13.1 on macOS.
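As a rough sketch of how the two approaches might be combined, the following dispatches on the size of the input. The name sum_digits_hybrid and the DIGIT_THRESHOLD constant are assumptions for illustration; the 30-digit cutoff is simply the one quoted above and will likely vary by machine and Python version. It also takes abs(n), because the while-loop version never terminates on a negative number (-1 // 10 is still -1 in Python).

```python
DIGIT_THRESHOLD = 30  # cutoff quoted in the answer; an assumption, tune as needed


def sum_digits_math(n):
    # Peel off the last digit with % and // until nothing is left.
    r = 0
    while n:
        r, n = r + n % 10, n // 10
    return r


def sum_digits_str_fast(n):
    # Count how often each non-zero digit occurs and weight it by its value.
    d = str(n)
    return sum(int(s) * d.count(s) for s in "123456789")


def sum_digits_hybrid(n):
    # Hypothetical wrapper: integer math for short inputs, strings for long ones.
    n = abs(n)
    if n < 10 ** DIGIT_THRESHOLD:
        return sum_digits_math(n)
    return sum_digits_str_fast(n)


print(sum_digits_hybrid(932))          # 14
print(sum_digits_hybrid(-932))         # 14
print(sum_digits_hybrid(10**40 + 5))   # 6
```

One caveat for the string path: on CPython 3.11 and later, str(n) raises ValueError for integers longer than 4300 digits unless the limit is raised with sys.set_int_max_str_digits().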
Upvotes: 1
Reputation: 2895
Why is the highest-rated answer 3.70x slower than this?
% echo; ( time (nice echo 33785139853861968123689586196851968365819658395186596815968159826259681256852169852986 \
| mawk2 'gsub(//,($_)($_)($_))+gsub(//,($_))+1' | pvE0 \
| mawk2 '
function __(_,___,____,_____) {
____=gsub("[^1-9]+","",_)~""
___=10
while((+____<--___) && _) {
_____+=___*gsub(___,"",_)
}
return _____+length(_) }
BEGIN { FS=OFS=ORS
RS="^$"
} END {
print __($!_) }' )| pvE9 ) | gcat -n | lgp3 ;
in0: 173MiB 0:00:00 [1.69GiB/s] [1.69GiB/s] [<=> ]
out9: 11.0 B 0:00:09 [1.15 B/s] [1.15 B/s] [<=> ]
in0: 484MiB 0:00:00 [2.29GiB/s] [2.29GiB/s] [ <=> ]
( nice echo | mawk2 'gsub(//,($_)($_)($_))+gsub(//,($_))+1' | pvE 0.1 in0 | )
8.52s user 1.10s system 100% cpu 9.576 total
1 2822068024
% echo; ( time ( nice echo 33785139853861968123689586196851968365819658395186596815968159826259681256852169852986 \
\
| mawk2 'gsub(//,($_)($_)($_))+gsub(//,($_))+1' | pvE0 \
| gtr -d '\n' \
\
| python3 -c 'import math, os, sys;
[ print(sum(int(digit) for digit in str(ln)), \
end="\n") \
\
for ln in sys.stdin ]' )| pvE9 ) | gcat -n | lgp3 ;
in0: 484MiB 0:00:00 [ 958MiB/s] [ 958MiB/s] [ <=> ]
out9: 11.0 B 0:00:35 [ 317miB/s] [ 317miB/s] [<=> ]
( nice echo | mawk2 'gsub(//,($_)($_)($_))+gsub(//,($_))+1' | pvE 0.1 in0 | )
35.22s user 0.62s system 101% cpu 35.447 total
1 2822068024
And that's being a bit generous already. On this large synthetically created test case of 2.82 GB, it's 19.2x slower.
% echo; ( time ( pvE0 < testcases_more108.txt | mawk2 'function __(_,___,____,_____) { ____=gsub("[^1-9]+","",_)~"";___=10; while((+____<--___) && _) { _____+=___*gsub(___,"",_) }; return _____+length(_) } BEGIN { FS=RS="^$"; CONVFMT=OFMT="%.20g" } END { print __($_) }' ) | pvE9 ) |gcat -n | ggXy3 | lgp3;
in0: 284MiB 0:00:00 [2.77GiB/s] [2.77GiB/s] [=> ] 9% ETA 0:00:00
out9: 11.0 B 0:00:11 [1016miB/s] [1016miB/s] [<=> ]
in0: 2.82GiB 0:00:00 [2.93GiB/s] [2.93GiB/s] [=============================>] 100%
( pvE 0.1 in0 < testcases_more108.txt | mawk2 ; )
8.75s user 2.36s system 100% cpu 11.100 total
1 3031397722
% echo; ( time ( pvE0 < testcases_more108.txt | gtr -d '\n' | python3 -c 'import sys; [ print(sum(int(_) for _ in str(__))) for __ in sys.stdin ]' ) | pvE9 ) |gcat -n | ggXy3 | lgp3;
in0: 2.82GiB 0:00:02 [1.03GiB/s] [1.03GiB/s] [=============================>] 100%
out9: 11.0 B 0:03:32 [53.0miB/s] [53.0miB/s] [<=> ]
( pvE 0.1 in0 < testcases_more108.txt | gtr -d '\n' | python3 -c ; )
211.47s user 3.02s system 100% cpu 3:32.69 total
1 3031397722
—————————————————————
UPDATE: native Python 3 code for the same concept. Even with my horrific Python skills, I'm seeing a 4x speedup:
% echo; ( time ( pvE0 < testcases_more108.txt \
\
|python3 -c 'import re, sys;
print(sum([ sum(int(_)*re.subn(_,"",__)[1]
for _ in [r"1",r"2", r"3",r"4",
r"5",r"6",r"7",r"8",r"9"])
for __ in sys.stdin ]))' |pvE9))|gcat -n| ggXy3|lgp3
in0: 1.88MiB 0:00:00 [18.4MiB/s] [18.4MiB/s] [> ] 0% ETA 0:00:00
out9: 0.00 B 0:00:51 [0.00 B/s] [0.00 B/s] [<=> ]
in0: 2.82GiB 0:00:51 [56.6MiB/s] [56.6MiB/s] [=============================>] 100%
out9: 11.0 B 0:00:51 [ 219miB/s] [ 219miB/s] [<=> ]
( pvE 0.1 in0 < testcases_more108.txt | python3 -c | pvE 0.1 out9; )
48.07s user 3.57s system 100% cpu 51.278 total
1 3031397722
Even the smaller test case managed a 1.42x speedup:
echo; ( time (nice echo 33785139853861968123689586196851968365819658395186596815968159826259681256852169852986 \
| mawk2 'gsub(//,($_)($_)$_)+gsub(//,$_)+1' ORS='' | pvE0 | python3 -c 'import re, sys; print(sum([ sum(int(_)*re.subn(_,"",__)[1] for _ in [r"1",r"2", r"3",r"4",r"5",r"6",r"7",r"8",r"9"]) for __ in sys.stdin ]))' | pvE9 )) |gcat -n | ggXy3 | lgp3
in0: 484MiB 0:00:00 [2.02GiB/s] [2.02GiB/s] [ <=> ]
out9: 11.0 B 0:00:24 [ 451miB/s] [ 451miB/s] [<=> ]
( nice echo | mawk2 'gsub(//,($_)($_)$_)+gsub(//,$_)+1' ORS='' | pvE 0.1 in0)
20.04s user 5.10s system 100% cpu 24.988 total
1 2822068024
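The counting idea used in the awk function (count the occurrences of each digit and multiply by its value) can also be sketched in Python without regular expressions and without loading the whole input at once. This is only an illustration: sum_digits_stream and its chunk_size parameter are made-up names, and no timing claim is made for it.

```python
import io


# Pair each digit value with its ASCII byte, e.g. (9, b"9").
DIGIT_BYTES = [(value, str(value).encode("ascii")) for value in range(1, 10)]


def sum_digits_stream(stream, chunk_size=1 << 20):
    """Sum every decimal digit in a binary stream, reading it chunk by chunk.

    Non-digit bytes (newlines, spaces) are ignored, so the input never has
    to be converted to an int.
    """
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        for value, digit in DIGIT_BYTES:
            total += value * chunk.count(digit)
    return total


print(sum_digits_stream(io.BytesIO(b"932\n14\n")))  # 19
```

To run it on piped input, sys.stdin.buffer could be passed as the stream.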
Upvotes: -4
Reputation: 62948
Both lines you posted are fine, but you can do it purely in integers, and it will be the most efficient:
def sum_digits(n):
    s = 0
    while n:
        s += n % 10
        n //= 10
    return s
or with divmod:
def sum_digits2(n):
    s = 0
    while n:
        n, remainder = divmod(n, 10)
        s += remainder
    return s
Slightly faster is using a single assignment statement:
def sum_digits3(n):
    r = 0
    while n:
        r, n = r + n % 10, n // 10
    return r
> %timeit sum_digits(n)
1000000 loops, best of 3: 574 ns per loop
> %timeit sum_digits2(n)
1000000 loops, best of 3: 716 ns per loop
> %timeit sum_digits3(n)
1000000 loops, best of 3: 479 ns per loop
> %timeit sum(map(int, str(n)))
1000000 loops, best of 3: 1.42 us per loop
> %timeit sum([int(digit) for digit in str(n)])
100000 loops, best of 3: 1.52 us per loop
> %timeit sum(int(digit) for digit in str(n))
100000 loops, best of 3: 2.04 us per loop
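The %timeit lines above are IPython magics. Outside IPython, a comparison along the same lines can be sketched with the standard-library timeit module. The benchmark helper, the test value, and the repeat counts below are arbitrary choices for illustration, and the numbers will differ by machine and Python version.

```python
import timeit


def sum_digits(n):
    s = 0
    while n:
        s += n % 10
        n //= 10
    return s


def sum_digits3(n):
    r = 0
    while n:
        r, n = r + n % 10, n // 10
    return r


def sum_digits_str(n):
    return sum(map(int, str(n)))


CANDIDATES = (sum_digits, sum_digits3, sum_digits_str)


def benchmark(n, number=10_000, repeat=3):
    """Return {function name: best seconds per call} for each candidate."""
    results = {}
    for func in CANDIDATES:
        # timeit.repeat accepts a callable; keep the fastest of the runs.
        best = min(timeit.repeat(lambda: func(n), number=number, repeat=repeat))
        results[func.__name__] = best / number
    return results


if __name__ == "__main__":
    for name, seconds in benchmark(1234567890).items():
        print(f"{name}: {seconds * 1e9:.0f} ns per call")
```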
Upvotes: 139
Reputation: 45
While doing some Codecademy challenges, I solved this like so:
def digit_sum(n):
    digits = []
    nstr = str(n)
    for x in nstr:
        digits.append(int(x))
    return sum(digits)
Upvotes: 2
Reputation: 55913
A base 10 number can be expressed as a series of the form
a × 10^p + b × 10^(p-1) + ... + z × 10^0
so the sum of a number's digits is the sum of the coefficients of the terms.
Based on this information, the sum of the digits can be computed like this:
import math
def add_digits(n):
    # Assume n >= 0, else we should take abs(n)
    if 0 <= n < 10:
        return n
    r = 0
    ndigits = int(math.log10(n))
    for p in range(ndigits, -1, -1):
        d, n = divmod(n, 10 ** p)
        r += d
    return r
This is effectively the reverse of the continuous division by 10 in the accepted answer. Given the extra computation in this function compared to the accepted answer, it's not surprising to find that this approach performs poorly in comparison: it's about 3.5 times slower, and about twice as slow as
sum(int(x) for x in str(n))
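To make the series view concrete, here is a small sketch that pulls out the coefficients from the most significant term down and checks that they rebuild the original number. digit_coefficients is a hypothetical helper; unlike add_digits above, it finds the leading power of ten by repeated multiplication rather than math.log10, a substitution made here only to stay in integer arithmetic.

```python
def digit_coefficients(n):
    """Return the base-10 coefficients of n >= 0, most significant first."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # Find the largest power of ten that does not exceed n (1 for n < 10).
    power = 1
    while power * 10 <= n:
        power *= 10
    coefficients = []
    while power:
        d, n = divmod(n, power)
        coefficients.append(d)
        power //= 10
    return coefficients


coeffs = digit_coefficients(932)
print(coeffs)       # [9, 3, 2]
print(sum(coeffs))  # 14

# Rebuild the number from the series: 9 * 10^2 + 3 * 10^1 + 2 * 10^0.
rebuilt = sum(c * 10 ** p for p, c in enumerate(reversed(coeffs)))
print(rebuilt)      # 932
```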
Upvotes: 0
Reputation: 183
Found this on one of the problem solving challenge websites. Not mine, but it works.
num = 0 # replace 0 with whatever number you want to sum up
print(sum([int(k) for k in str(num)]))
Upvotes: 8
Reputation: 9
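Here is a solution without any loop or recursion. Note that it computes the digital root (the digits summed repeatedly until a single digit remains), not the plain digit sum, and it works for non-negative integers only (Python 3):
def sum_digits(n):
    if n > 0:
        s = (n-1) // 9
        return n-9*s
    return 0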
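A quick check suggests that the closed form above matches the digital root (digits summed repeatedly until one digit is left) rather than the one-pass digit sum asked about in the question. The helper names below are illustrative only.

```python
def closed_form(n):
    # Same arithmetic as the function above, under a different name.
    if n > 0:
        return n - 9 * ((n - 1) // 9)
    return 0


def digit_sum(n):
    # One pass over the decimal digits.
    return sum(map(int, str(n)))


def digital_root(n):
    # Keep summing digits until a single digit remains.
    while n >= 10:
        n = digit_sum(n)
    return n


print(closed_form(932), digit_sum(932), digital_root(932))  # 5 14 5
```

For single-digit inputs all three agree, which may be why the difference is easy to miss.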
Upvotes: 0
Reputation: 87
Try this
print(sum(list(map(int,input("Enter your number ")))))
Upvotes: 0
Reputation: 1
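You can also try this with the built-in function divmod():
number = abs(int(input('Enter any integer: ')))  # abs() keeps the loop finite for negative input
total = 0  # avoid shadowing the built-in sum()
while number != 0:
    # divmod returns (quotient, remainder) in a single call
    number, digit = divmod(number, 10)
    total += digit
print(total)
This works for a number with any number of digits.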
Upvotes: -2
Reputation: 3653
If you want to keep summing the digits until you get a single-digit number (one of my favorite characteristics of numbers divisible by 9) you can do:
def digital_root(n):
    x = sum(int(digit) for digit in str(n))
    if x < 10:
        return x
    else:
        return digital_root(x)
Which actually turns out to be pretty fast itself...
%timeit digital_root(12312658419614961365)
10000 loops, best of 3: 22.6 µs per loop
Upvotes: 16
Reputation: 79
This might help
def digit_sum(n):
    num_str = str(n)
    total = 0  # avoid shadowing the built-in sum()
    for ch in num_str:
        total += int(ch)
    return total
Upvotes: 7