Reputation: 49

Print 1 to N - Digits Count

I am working on a question in a coding website and got some problem in solving it. The question is:

Print 1 to N - Digits Count [ZOHO]

A positive integer N is passed as the input. If we print all the numbers from 1 to N continuosly, the program must find the number of characters(digits) C printed and print the count C as the output.

Input Format: The first line contains N.

Output Format: The first line contains C.

Boundary Conditions: 1 <= N <= 9999999

Example Input/Output 1: Input: 2

Output: 2

Explanation: We print 12 and hence the total number of characters is 2.

Example Input/Output 2: Input: 15

Output: 21

Explanation: We print 123456789101112131415 and hence the total number of characters is 21.

I solved the problem, but my code didn't pass all the test cases (they are hidden). This is my code:

n = int(input())
assert 1<=n<=9999999
a = [str(i) for i in range(1,n+1)]
b = ''.join(a)
print(len(b))

Is this is the right way to solve the problem or is there any other way which can be used to solve this problem?

Upvotes: 2

Answers (3)

Adirio

Reputation: 5286

I found two new methods, the first uses an approach similar to Miraj50's (I was developing it while he posted so I will keep it here for the sake of completeness) in a for loop fashion while the second one uses vector algebra to calculate it, it may be implemented in numpy also, which may improve its performance for big numbers. You can see them as methods a and b in the folloing section.

Profiling script

I did some timing with the following script. Modify INPUTS and REPETITIONS at will. It takes a while as it is executing REPETITIONS * len(INPUTS) times each method and len(INPUTS) times the OP's method to validate the results. It calculates the mean, median and standard deviation of the samples and print them in a nice table format. If you want to add a new method you just need to define the function and include it in the methods dict.

# Values to test
INPUTS = [7, 9, 53, 99, 999, 9999, 99999, 999999, 9999999]
# Number of repetitions
REPETITIONS = 30000

from timeit import default_timer as timer

def a(n): # Iterative method
    #n = input("Insert a number: ")
    start = timer()
    l = len(n)
    if l == 1: # Special case as int("9"*0) raises ValueError
        c = int(n)
    else:
        n, c = int(n), 9
        for i in range(2, n+1):
            if l == i:
                c += (n - int("9"*(i-1))) * i
                break
            c += int("9"+ "0"*(i-1)) * i
    print(c)
    end = timer()
    return c, end - start

def b(n): # No-loop method
    #n = input("Insert a number: ")
    start = timer()
    l = len(n)
    n = int(n)
    aux = [9*10**i for i in range(l-1)]
    aux.append(n - sum(aux))
    c = sum((v*(i+1) for i, v in enumerate(aux)))
    print(c)
    end = timer()
    return c, end - start

def c(n): # Miraj50's method
    #n = input("Insert a number: ")
    start = timer()
    l = len(n)
    x, c = 9, 0
    for i in range(1, l):
        c = c + x*i
        x = x * 10
    c = c + (int(n) - 10**(l-1) + 1) * l
    print(c)
    end = timer()
    return c, end - start

def d(n): # Pm2 Ring's method
    #n = input("Insert a number: ")
    start = timer()
    l = len(n)
    a = 10 ** (l - 1) 
    c = (int(n) + 1) * l - a - (a - 1) // 9
    print(c)
    end = timer()
    return c, end - start


# Method mapper
methods = {
           'Adirio 1': a,
           'Adirio 2': b,
           'Miraj50' : c,
           'PM2 Ring': d,
          }
methods_list = sorted(methods.keys())

# Calculate the results that are valid
results = []
for n in INPUTS:
    results.append(len(''.join(str(i) for i in range(1, n+1))))

# Calculate the results by the different methods
times = {}
for n, result in zip(INPUTS, results):
    t = {} # Store times
    for key in methods:
        aux1 = []
        for i in range(REPETITIONS):
            c, aux2 = methods[key](str(n))
            if c != result:
                raise ValueError("Method {} for n = {} yielded {} and {} was expected".format(key.upper(), n, c, result))
            aux1.append(aux2)
        aux1.sort()
        mean = sum(aux1) / REPETITIONS
        if REPETITIONS % 2 == 0:
            median = (aux1[REPETITIONS//2 - 1] + aux1[REPETITIONS//2]) / 2
        else:
            median = aux1[(REPETITIONS-1)//2]
        std_dev = (sum((x-mean)**2 for x in aux1)/(REPETITIONS-1))**.5
        t[key] = {
            'mean': mean,
            'median': median,
            'std_dev': std_dev,
            'min': aux1[0],
            'max': aux1[-1],
        }
    times[n] = t

# Print the results
for n, c in zip(INPUTS, results):
    print()
    print("N = {:<15}".format(n), end='')
    for method in methods_list:
        print("|{:^19}".format("Method {}".format(method)), end='')
    print()
    print("+".join(["-"*19]*(len(methods_list)+1)))
    for metric in times[INPUTS[0]][methods_list[0]]:
        print(" {:<12} [us] ".format(metric), end='')
        for method in methods_list:
            print("| {:>17.2f} ".format(times[n][method][metric] *10**6), end='')
        print()

Results

 N = 7             |  Method Adirio 1  |  Method Adirio 2  |  Method Miraj50   |  Method PM2 Ring  
-------------------+-------------------+-------------------+-------------------+-------------------
 median       [us] |             12.76 |             14.95 |             13.49 |             13.13 
 min          [us] |             10.21 |             12.03 |             10.94 |             10.57 
 mean         [us] |             85.15 |             87.11 |             83.93 |             85.68 
 max          [us] |          35319.77 |          37254.20 |          35428.07 |          35010.91 
 std_dev      [us] |           1486.21 |           1480.59 |           1465.55 |           1483.09 

 N = 9             |  Method Adirio 1  |  Method Adirio 2  |  Method Miraj50   |  Method PM2 Ring  
-------------------+-------------------+-------------------+-------------------+-------------------
 median       [us] |             12.76 |             14.95 |             13.49 |             13.13 
 min          [us] |             10.21 |             12.03 |             10.57 |             10.21 
 mean         [us] |             84.94 |             87.42 |             86.09 |             85.72 
 max          [us] |          34595.95 |          35712.49 |          36281.33 |          36368.85 
 std_dev      [us] |           1485.61 |           1486.80 |           1489.28 |           1482.85 

 N = 53            |  Method Adirio 1  |  Method Adirio 2  |  Method Miraj50   |  Method PM2 Ring  
-------------------+-------------------+-------------------+-------------------+-------------------
 median       [us] |             13.86 |             15.32 |             13.86 |             13.49 
 min          [us] |             10.94 |             12.03 |             10.94 |             10.94 
 mean         [us] |             85.93 |             88.38 |             85.41 |             85.91 
 max          [us] |          34776.45 |          38539.21 |          36581.07 |          36074.94 
 std_dev      [us] |           1477.20 |           1489.63 |           1477.41 |           1485.53 

 N = 99            |  Method Adirio 1  |  Method Adirio 2  |  Method Miraj50   |  Method PM2 Ring  
-------------------+-------------------+-------------------+-------------------+-------------------
 median       [us] |             14.22 |             15.32 |             13.86 |             13.49 
 min          [us] |             10.94 |             12.76 |             10.94 |             10.94 
 mean         [us] |             88.21 |             88.46 |             86.46 |             87.04 
 max          [us] |          40802.19 |          35949.14 |          36355.72 |          39536.51 
 std_dev      [us] |           1518.92 |           1489.06 |           1488.02 |           1525.52 

 N = 999           |  Method Adirio 1  |  Method Adirio 2  |  Method Miraj50   |  Method PM2 Ring  
-------------------+-------------------+-------------------+-------------------+-------------------
 median       [us] |             14.59 |             16.04 |             14.22 |             13.49 
 min          [us] |             12.03 |             13.13 |             11.30 |             10.57 
 mean         [us] |             87.68 |             88.48 |             86.96 |             85.81 
 max          [us] |          35570.64 |          36136.57 |          42908.73 |          36164.28 
 std_dev      [us] |           1488.79 |           1482.06 |           1489.39 |           1481.22 

 N = 9999          |  Method Adirio 1  |  Method Adirio 2  |  Method Miraj50   |  Method PM2 Ring  
-------------------+-------------------+-------------------+-------------------+-------------------
 median       [us] |             15.32 |             16.41 |             14.22 |             13.49 
 min          [us] |             12.40 |             13.49 |             11.67 |             10.57 
 mean         [us] |             88.06 |             88.91 |             86.78 |             85.58 
 max          [us] |          34688.20 |          34618.92 |          35010.55 |          40569.91 
 std_dev      [us] |           1478.79 |           1473.64 |           1495.85 |           1480.70 

 N = 99999         |  Method Adirio 1  |  Method Adirio 2  |  Method Miraj50   |  Method PM2 Ring  
-------------------+-------------------+-------------------+-------------------+-------------------
 median       [us] |             16.04 |             17.14 |             14.59 |             13.49 
 min          [us] |             13.13 |             14.22 |             11.67 |             10.94 
 mean         [us] |             89.44 |             89.79 |             88.20 |             86.94 
 max          [us] |          44273.23 |          42208.25 |          40763.54 |          37936.45 
 std_dev      [us] |           1497.36 |           1482.44 |           1512.71 |           1495.37 

 N = 999999        |  Method Adirio 1  |  Method Adirio 2  |  Method Miraj50   |  Method PM2 Ring  
-------------------+-------------------+-------------------+-------------------+-------------------
 median       [us] |             16.41 |             17.50 |             14.59 |             13.49 
 min          [us] |             13.86 |             14.59 |             12.03 |             10.94 
 mean         [us] |             88.60 |             90.63 |             87.18 |             85.68 
 max          [us] |          35199.80 |          36482.62 |          48330.99 |          51652.53 
 std_dev      [us] |           1475.87 |           1485.87 |           1483.34 |           1491.72 

 N = 9999999       |  Method Adirio 1  |  Method Adirio 2  |  Method Miraj50   |  Method PM2 Ring  
-------------------+-------------------+-------------------+-------------------+-------------------
 median       [us] |             17.14 |             17.87 |             14.95 |             13.86 
 min          [us] |             14.22 |             15.32 |             12.40 |             11.30 
 mean         [us] |             89.97 |             89.86 |             87.74 |             85.51 
 max          [us] |          35618.05 |          35605.65 |          36453.44 |          34081.07 
 std_dev      [us] |           1480.06 |           1467.89 |           1488.62 |           1475.57

Interpreting results

Of the 5 measures that are being tabulated you need to be aware of some implications. As the OS's scheduler may execute different processes in the middle of an algorithm, some of the timings have a quite big error, they take way too much time that what they really needed. This means that the upper bound given by the max, or the mean and standard deviation that consider all values in the sample, give results that are not really trustworthy. Both the minimum value and the median remove the influence of this really long cases in a different way. The comparisons of the algorithms based on the median and the minimum are very similar: all of the algorithms have a similar execution time being @PM2Ring 's version slightly faster.

Upvotes: 0

PM 2Ring

Reputation: 55469

Here's a version that uses essentially the same algorithm as Miraj50's answer, except it doesn't need a loop to calculate the length of numbers that have less digits than the current input number. I developed my formula with the help of this OEIS entry: A033713 "Number of zeros in numbers 1 to 999..9 (n digits)".

I'll add a slightly modified version of Miraj50's code, and the brute-force version given in the comments by Blender, to show that they all give the same results.

def nlen_blender(n):
    return sum(len(str(i)) for i in range(1,n+1))

def nlen_miraj(n):
    length = len(str(n))
    x, s = 9, 0
    for i in range(1,length):
        s = s + x*i
        x = x*10 
    return s + (n - 10**(length-1) + 1) * length

def nlen_pm2r(n):
    digits = len(str(n))
    a = 10 ** (digits - 1) 
    return (n + 1) * digits - a - (a - 1) // 9

# test

funcs = (nlen_blender, nlen_miraj, nlen_pm2r)
data = (0, 1, 37, 563, 4285, 12900, 375462)
for n in data:
    print(n)
    for f in funcs:
        print('{:13}: {}'.format(f.__name__, f(n)))
    print()

n = 12398765434324
print(n)
for f in (nlen_miraj, nlen_pm2r):
    print(f(n))

output

0
nlen_blender : 0
nlen_miraj   : 0
nlen_pm2r    : 0

1
nlen_blender : 1
nlen_miraj   : 1
nlen_pm2r    : 1

37
nlen_blender : 65
nlen_miraj   : 65
nlen_pm2r    : 65

563
nlen_blender : 1581
nlen_miraj   : 1581
nlen_pm2r    : 1581

4285
nlen_blender : 16033
nlen_miraj   : 16033
nlen_pm2r    : 16033

12900
nlen_blender : 53394
nlen_miraj   : 53394
nlen_pm2r    : 53394

375462
nlen_blender : 2141667
nlen_miraj   : 2141667
nlen_pm2r    : 2141667

12398765434324
162471604969439
162471604969439

Here's a version that does timeit tests. As you can see, eliminating that for loop speeds things up, especially when n is large.

from timeit import Timer

def nlen_miraj(n):
    length = len(str(n))
    x, s = 9, 0
    for i in range(1,length):
        s = s + x*i
        x = x*10 
    return s + (n - 10**(length-1) + 1) * length

def nlen_pm2r(n):
    digits = len(str(n))
    a = 10 ** (digits - 1) 
    return (n + 1) * digits - a - (a - 1) // 9

def time_test(num, loops):
    timings = []
    for func in funcs:
        t = Timer(lambda: func(num))
        result = sorted(t.repeat(3, loops))
        timings.append((result, func.__name__))
    timings.sort()
    for result, name in timings:
        print('{:13} : {}'.format(name, result))
    print()

funcs = (nlen_miraj, nlen_pm2r)
data = (0, 1, 37, 563, 4285, 12900, 375462, 12398765434324)

loops = 10000
for n in data:
    print(n)
    time_test(n, loops)

output

0
nlen_pm2r     : [0.03521797800203785, 0.0357760570004757, 0.035887997000827454]
nlen_miraj    : [0.0467547009975533, 0.04689033899921924, 0.04821185600303579]

1
nlen_pm2r     : [0.03615388999969582, 0.03690062500027125, 0.037922888001048705]
nlen_miraj    : [0.04778529000031995, 0.04816070699962438, 0.05409854399840697]

37
nlen_pm2r     : [0.04198674500003108, 0.04201827299766592, 0.04204300800120109]
nlen_miraj    : [0.06709170100293704, 0.067645346000063, 0.07428676299969084]

563
nlen_pm2r     : [0.04548632699879818, 0.045831783001631266, 0.04651351099892054]
nlen_miraj    : [0.0761348430023645, 0.07634904800215736, 0.08261940699958359]

4285
nlen_pm2r     : [0.04845972700059065, 0.048681981999834534, 0.04915663500287337]
nlen_miraj    : [0.08794620700064115, 0.09279651500037289, 0.0934514330001548]

12900
nlen_pm2r     : [0.04871700200237683, 0.04922142199939117, 0.050075729999662144]
nlen_miraj    : [0.09708551099902252, 0.09920390000115731, 0.12253420100023504]

375462
nlen_pm2r     : [0.05645462199754547, 0.056796938999468694, 0.05760099699909915]
nlen_miraj    : [0.1092712979989301, 0.10961470600159373, 0.11748507000083919]

12398765434324
nlen_pm2r     : [0.05949370799862663, 0.060066635000112, 0.06006887100011227]
nlen_miraj    : [0.20019025200235774, 0.20034944599683513, 0.2058156430030067]

These timings were obtained on my old 2GHz 32 bit machine running Python 3.6.0 on a Debian derivative of Linux.

Upvotes: 1

Miraj50

Reputation: 4407

Since this is a Math question, I would preferably use a more mathematical approach as it would use much less memory and will be much more efficient.

Here is my approach.

Calculate the length of the number.

If length = 1, you are guaranteed to have 9x1 characters.

If length = 2, you are guaranteed to have 9x1 + 90x2 characters.

If length = 3, you are guaranteed to have 9x1 + 90x2 + 900x3 characters and so on...

I hope you can see a loop forming here.

For the rest part, take an example of 1234, 1234 - 1000 = 234 numbers will have length = length of the number. No, Mistake : 234 +1 = 235 numbers will have length = length of the number !

Formulating it in a code.

n = input()
length = len(n)
x,s = 9,0
for i in range(1,length):
    s=s+x*i
    x = x*10 # 9-> 90-> 900->
print(s+(int(n)-10**(length-1)+1)*length) # 234 + 1
#         1234 - 10^3
#            = 234            +1
#                          = 235 * 4(length) = ...

>>> 1234
3829
>>>> 15
21

I also did import time to see efficiency. For n=9999999, your code took about 3.26 sec, while this one takes about 160-200 usec.