Reputation: 11
I am writing some code in python to find out the first natural number which does not appear in the first billion digits of pi. Here's what I've written:
import datetime
from tqdm import tqdm
def findpi(end):
notin = []
stop = 0
s1 = datetime.datetime.now()
with open(r"C:\Users\shamm\Desktop\Text Documents\1 BILLION Digits of pi.txt", 'rt') as p:
pi = str(p.read())
for i in tqdm(range(1,end+1)):
if str(i) not in pi:
if notin == []:
stop = i
notin.append(i)
s2 = datetime.datetime.now()
tdelta = s2 - s1
ts = tdelta.total_seconds()
return [notin, stop, ts]
pi = findpi(1000000)
print("Not in:", pi[0])
print("Last:", pi[1])
print("Time taken:", pi[2])
It works fine for small numbers, and when I tried it with the first million natural numbers, the code faces a sudden blockade. For the first 10 seconds, it runs at about 10k iterations/s, but then it suddenly goes down to 1k iterations/s. I tried it with a larger input of 10 million, and the same thing happens, running at 10k it/s only for 10 seconds. After 30+ minutes of running, it goes down to 100 it/s.
Is there a bottleneck which limits it from using too much memory or processing power? Or is something wrong in my code?
Edit: so it seems like the cause is the length of the ever-increasing length of the number which needs to be searched. How can I optimize this search so that it will not slow down with every increase in the number of digits?
Upvotes: 0
Views: 69
Reputation: 10466
That's expected. You slow down by factor 10 every time you reach a larger number of digits. Numbers with k+1 digits take about ten times longer to find than numbers with k digits. Because their first k digits are equally hard to find but then there's only a 10% chance that the extra digit also matches.
You could instead go through pi's digits, building the numbers it does contain, and record them. Then a second pass can quickly look up whether each number was seen:
with open(r"...", 'rb') as p:
pi = bytearray(p.read())
pi = pi.translate(pi.maketrans(b'0123456789', bytes(range(10))))
seen = bytearray(end+1)
while pi:
i = 0
for d in pi:
i = i * 10 + d
if i > end:
break
seen[i] = 1
del pi[0]
for i in tqdm(range(1,end+1)):
if not seen[i]:
if notin == []:
stop = i
notin.append(i)
With 10 million digits, findpi(1000000)
takes me about 6 seconds, so I estimate with a billion digits it will take about 10 minutes.
Upvotes: 0