Reputation: 243
Issue: As the title states, I am downloading data via FTP from NOAA based on the year and the day. I have configured my script to loop through a range of years and download data for each day. However, the script gets hung up on days where no file exists: it just keeps retrying the same URL and printing that the file does not exist. Without the time.sleep(5) the script floods the log.
Solution: Somehow skip the missing day and move on to the next one. I have tried continue (maybe I am placing it in the wrong spot) and creating an empty file for the missing day (not elegant, and it still does not move past the missing day). I am at a loss; what have I overlooked?
Here is the script:
##Working 24km
import urllib2
import time
import os
import os.path
flink = 'ftp://sidads.colorado.edu/DATASETS/NOAA/G02156/24km/{year}/ims{year}{day}_24km_v1.1.asc.gz'
days = [str(d).zfill(3) for d in range(1,365,1)]
years = range(1998,1999)
flinks = [flink.format(year=year,day=day) for year in years for day in days]
from urllib2 import Request, urlopen, URLError
for fname in flinks:
    dl = False
    while dl == False:
        try:
            # req = urllib2.Request(fname)
            req = urllib2.urlopen(fname)
            with open('/Users/username/Desktop/scripts_hpc/scratch/' + fname.split('/')[-1], 'w') as dfile:
                dfile.write(req.read())
            print 'file downloaded'
            dl = True
        except URLError, e:
            #print 'sleeping'
            print e.reason
            #print req.info()
            print 'skipping day: ', fname.split('/')[-1], ' was not processed for ims'
            continue
            '''
            if not os.path.isfile(fname):
                f = open('/Users/username/Desktop/scripts_hpc/empty/'+fname.split('/')[-1], 'w')
                print 'day was skipped'
            '''
        time.sleep(5)
    else:
        break
        #everything is fine
Research: I have browsed through other questions and they come close, but don't seem to hit the nail on the head: "Ignore missing files Python ftplib", "how to skip over lines of a file if they are empty". Any help would be greatly appreciated!
Thank you!
Upvotes: 1
Views: 320
Reputation: 95
On the except, use pass instead of continue, since continue can only be used inside loops (for, while).
With that you won't need to handle the missing files, since Python will just ignore the error and keep going.
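A minimal sketch of the idea (the fetch function and day list are hypothetical stand-ins for urllib2.urlopen and the NOAA file names, not the asker's actual code): when the except just passes, a plain for loop skips the failed day and moves on, with no retry loop to get stuck in.

```python
def fetch(day, missing={"005", "007"}):
    """Stand-in downloader: raises for 'missing' days, like a 404/550 would."""
    if day in missing:
        raise IOError("file not found for day " + day)
    return "data-" + day

downloaded = []
for day in ["004", "005", "006", "007"]:
    try:
        downloaded.append(fetch(day))
    except IOError:
        # Missing day: do nothing and let the for loop advance.
        pass

# downloaded == ["data-004", "data-006"]
```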
Upvotes: 1
Reputation: 243
I guess when you stand up, walk away, and get some coffee, things become clear. Apparently something was getting hung up in my while statement (I'm still unsure why). When I took that out and used pass instead of continue, it behaved correctly.
Here's what it looks like now:
for fname in flinks:
    try:
        req = urllib2.urlopen(fname)
        with open('/Users/username/Desktop/scripts_hpc/scratch/' + fname.split('/')[-1], 'w') as dfile:
            dfile.write(req.read())
        print 'file downloaded'
    except URLError, e:
        print e.reason
        print 'skipping day: ', fname.split('/')[-1], ' was not processed for ims'
        pass
    time.sleep(5)
Upvotes: 1