sudoer
sudoer

Reputation: 155

Get amount of numbers with at least 1 occurence of 7 in range

I want to find out how many numbers contain at least one 7 in Python. For example:

Here is what I am using:

print(len([x for x in range(int(input())+1) if "7" in str(x)]))

It works, but extremely slow for large numbers. Is there any quicker method to calculate (note I used "calculate" instead of "find") how many numbers match the condition using mathematical algorithms?

A few things:

Please solve this for me, thanks!

Upvotes: 3

Views: 688

Answers (2)

sudoer
sudoer

Reputation: 155

I found this solution:

def count(n: int) -> int: # function to find numbers with 7
    if n < 7: return 0
    d: int = len(str(n)) - 1
    p: int = 10 ** d
    m: int = n // p # most significant digit
    t: int = p - 9 ** d # for example, count of 10^2 is 10^2-9^2
    if m == 7:  return m * t + n % p + 1 # if n = 778, return the count of 0-699 + 701-778 + 700
    elif m > 7:  return (m - 1) * t + p + count(n % p) # if n = 987, return (0-699 + 800-899) + 700-799 + 900-987
    else: return m * t + count(n % p) # if n = 123, return 0-99 + 100-123 (found by recursion)
    # return (m - (1 if m > 7 else 0)) * (10 ** d - 9 ** d) + (n % p + 1 if m == 7 else count(n % p) + (p if m > 7 else 0))
# def count(n: int) -> int: return 0 if n < 7 else (n // (10 ** (len(str(n)) - 1)) - (1 if n // (10 ** (len(str(n)) - 1)) > 7 else 0)) * (10 ** (len(str(n)) - 1) - 9 ** (len(str(n)) - 1)) + (n % (10 ** (len(str(n)) - 1)) + 1 if n // (10 ** (len(str(n)) - 1)) == 7 else count(n % (10 ** (len(str(n)) - 1))) + (10 ** (len(str(n)) - 1) if n // (10 ** (len(str(n)) - 1)) > 7 else 0))
print(count(int(input())))

Freely ignore the commented lines (just one-liners I made while I am bored).

The solution is way faster than the brute force solution.

Upvotes: 1

Nonlinear
Nonlinear

Reputation: 810

The time complexity of the brute force algorithm is o(n) if i am not mistaken. I found a way to make it a o(log10(n)) which I hope is fast enough. The algorithm works thanks to the folowing observation:

Let f(x) be the function we want.

f(20) = 2f(10)

f(30) = 3f(10)

but f(100) = 9f(10) + 10

because every number between 70 and 80 needs to be counted. We can deduce two important properties:

  1. f(x10y) = xf(10y)

if x < 7 and

  1. f(10x) = 9f(10x-1)+10x-1

which can be solved for our case:

  • f(x) = x-9ln(x)/ln(10)

But, this function only works for powers of 10, so we will call it g(x). This simplifies the problem a lot. Now we dont need to check every number up to x, but only the powers of 10 preceding it. I found a way to make the idea work in python but it is quite messy. If you can improve it or find a better formula, please let me know.

from math import log as ln
# I know that you said 'no modules' but I need logarithms :)

def g(x):
    return x - 9**(ln(x)/ln(10))

def step(x): # step function
    if x < 7:
        return 0
    else:
        return 1

def int_(x): # same as int() but returns 0 for ''
    if x == '':
        return 0
    else:
        return int(x)
    
def get_7s(num):
    answer = 0
    
    str_num = str(num)
    
    if '7' in str_num:
        split = str_num.split('7', 1) # cut the number at the first '7' occurence
        answer += int_(split[1]) + 1 # add the second half to the answer
        str_num = str(int(split[0] + '7' + (len(split[1]))*'0') - 1)
        # comes back to the integer before 
        # 1379 -> 1369
        # 18753 -> 18699
    else:
        pass
    
    for power_of_10, digit in enumerate(str_num[::-1]): # [::-1] lets us go power of 10 by power of 10
        digit = int(digit)
        answer += (digit - step(digit)) * g(10**power_of_10) + step(digit) * 10**power_of_10
        # the idea is to make property 1 work for all x.
        # Two modifications are necessary:
        # 'digit - step(digit)' to avoid counting 7s twice. 
        # '+ step(digit) * 10**power_of_10' because if step(digit) returns 1,
        # it means that a 7 was passed in the counting, so we need to add them back
    return answer

test:

print(len([x for x in range(8756+1) if "7" in str(x)]))
>> 3087
print(get_7s(8756))
>> 3087

And it is way faster than the original:

%timeit len([x for x in range(10_000_000) if "7" in str(x)])
>> 1.91 s ± 51 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit get_7s(10_000_000)
>> 9.73 µs ± 406 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

I hope this helps!

Upvotes: 0

Related Questions