Reputation: 1323
I have a program that scrapes a website and downloads files when it finds them. Often it runs just fine, but at other times it flat out terminates before it has finished searching the sequence. It never quits while downloading, only while searching. I'm currently guessing a socket error, but I'm stumped. I've put in HTTPError checking and nothing gets displayed for an HTTP error code. I'm now trying to add socket error checking and it gets even crazier. Before I added anything for socket errors, the program ran. Once I add the socket error checking, the program won't even run: it brings up the IDLE window and shows the cursor, as if IDLE is ready and waiting for the next command. Without the socket error checking, IDLE indicates that the program is running until the tkinter window shuts down (that is, the program terminates unexpectedly). When the tkinter window shuts down, IDLE doesn't show any error.
What do I have to do to find out why this program sometimes terminates early, and to trap the error so that it doesn't terminate and instead goes back and retries the same web address? I think I have the retrying taken care of, but I don't have the socket error handling correct, if it's even a socket error.
#!/usr/bin/python3.4
import urllib.request
import os
from tkinter import *
import time
import urllib.error
import errno
root = Tk()
root.title("photodownloader")
root.geometry("200x200")
app = Frame(root)
app.grid()
os.chdir('/home/someone/somewhere/')
Fileupdate = 10000
Filecount = 19999
while Fileupdate <= Filecount:
    try:
        root.title(Fileupdate)
        url = 'http://www.webpage.com/photos/'+str(Fileupdate)+'.jpg'
        a = urllib.request.urlopen(url)
        urllib.request.urlretrieve(url, str(Fileupdate)+'.jpg')
    except urllib.error.HTTPError as err:
        if err.code == 404:
            Fileupdate = Fileupdate + 1
            root.update_idletasks()
            continue
        else:
            print(err.code)
            Fileupdate = Fileupdate + 1
            root.update_idletasks()
            continue
    except socket.error, v:
        print(Fileupdate, v[0])
        continue
    Fileupdate = Fileupdate+1
    root.update_idletasks()
Upvotes: 1
Views: 1032
Reputation: 21453
I think the problem is that tkinter is never given the chance to start its main event loop, which happens when you call root.mainloop(). I'd recommend turning the code you currently have in the while loop into a function that is called periodically with the root.after() method. I have included a potential change to test whether this fixes the issue.
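Independent of the download logic, the scheduling pattern itself can be sketched as follows. This is only a minimal illustration: make_poller and work are hypothetical names introduced here, and widget is assumed to be any object with a Tk-style after(ms, callback) method, such as root.

```python
def make_poller(widget, work, delay_ms=100):
    """Return a callback that runs `work` once, then re-schedules itself.

    `widget` is assumed to offer a Tk-style after(ms, callback) method.
    `work` is a hypothetical function that does one unit of work and
    returns True while there is more to do.
    """
    def poll():
        # Do one unit of work; only re-arm the timer if more work remains.
        if work():
            widget.after(delay_ms, poll)
    return poll
```

With a real window this would be started with root.after(0, make_poller(root, work)) followed by root.mainloop(), so the event loop stays in control between units of work.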
Note that the lines:
Fileupdate = Fileupdate + 1
root.update_idletasks()
continue
in the except branches are redundant, since the same statements run at the end of the loop body anyway, so part of modifying the code to work in a function was simply to get rid of those parts. Here is the code I'd like you to try running, starting from the original while statement:
#-while Fileupdate <= Filecount:
def UPDATE_SOCKET():
    global Fileupdate #allow the global variable to be changed
    if Fileupdate <= Filecount:
#/+
        try:
            root.title(Fileupdate)
            url = 'http://www.webpage.com/photos/'+str(Fileupdate)+'.jpg'
            a = urllib.request.urlopen(url)
            urllib.request.urlretrieve(url, str(Fileupdate)+'.jpg')
        except urllib.error.HTTPError as err:
            #<large redundant section removed>
            print("error code",err.code)
        except socket.error as v: #needs "import socket" next to the other imports
            #exceptions cannot be indexed in Python 3, so print v rather than v[0]
            print("socket error",Fileupdate, v)
#-            continue
            root.after(500, UPDATE_SOCKET) #retry the same address in 500 milliseconds
            return
#/+
        Fileupdate = Fileupdate+1
#-        root.update_idletasks()
        root.after(100, UPDATE_SOCKET) #after 100 milliseconds call the function again
        #change 100 to a smaller time to call this more frequently.
root.after(0,UPDATE_SOCKET) #when it has a chance, call the update for first time
root.mainloop() #enter main event loop
#/+
I indicate changed lines with a #- followed by the chunk that replaces them, ending with a #/+
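The except clauses are worth a closer look on their own. In Python 3 the Python 2 spelling except socket.error, v: is a syntax error, so a script containing it will not start at all; exceptions also cannot be indexed with v[0], and socket.error has been an alias of OSError since Python 3.3, so catching OSError also covers urllib.error.URLError. Below is a minimal sketch of one download attempt under those rules. The helper try_download and its retrieve parameter are hypothetical names introduced here for illustration; they are not part of the original program.

```python
import urllib.error
import urllib.request


def try_download(url, filename, retrieve=urllib.request.urlretrieve):
    """Attempt one download and report what happened.

    Returns "ok" on success, "skip" on an HTTP error (move on to the
    next file) or "retry" on a network-level failure (try the same
    address again). `retrieve` is injectable only so the sketch can be
    exercised without a network connection.
    """
    try:
        retrieve(url, filename)
    except urllib.error.HTTPError as err:
        # Must be listed first: HTTPError is a subclass of URLError,
        # which is itself a subclass of OSError.
        print("HTTP error", err.code, url)
        return "skip"
    except OSError as err:
        # socket.error is an alias of OSError, so this branch also
        # catches connection resets, timeouts and URLError.
        print("network error", url, err)
        return "retry"
    return "ok"
```

Inside UPDATE_SOCKET, a "retry" result would correspond to the root.after(500, UPDATE_SOCKET) branch, while "ok" and "skip" would both advance Fileupdate.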
Upvotes: 1