glossarch

Reputation: 272

urllib and urllib2 return "IOError: [Errno socket error] [Errno -2] Name or service not known" but Firefox downloads the file with no trouble

I am trying to download GRIB data (binary weather forecast data) from the National Weather Service. I have written Python code to format the HTTP string to get data for today, looking 12 hours ahead.

The Python code returns the HTTP string, then attempts to use urllib.urlopen to download the data. Now, if I paste the HTTP string into Firefox, the GRIB file downloads. If I try to use urllib.urlopen, I get the following:

Traceback (most recent call last):
File "/home/dantayaga/bovine_aerospace/dev/grib_get.py", line 67, in <module>
webf=urllib.urlopen(griburl)
File "/usr/lib/python2.7/urllib.py", line 86, in urlopen
return opener.open(url)
File "/usr/lib/python2.7/urllib.py", line 207, in open
return getattr(self, name)(url)
File "/usr/lib/python2.7/urllib.py", line 344, in open_http
h.endheaders(data)
File "/usr/lib/python2.7/httplib.py", line 954, in endheaders
self._send_output(message_body)
File "/usr/lib/python2.7/httplib.py", line 814, in _send_output
self.send(msg)
File "/usr/lib/python2.7/httplib.py", line 776, in send
self.connect()
File "/usr/lib/python2.7/httplib.py", line 757, in connect
self.timeout, self.source_address)
File "/usr/lib/python2.7/socket.py", line 553, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
IOError: [Errno socket error] [Errno -2] Name or service not known

Here is the HTTP string I am using:

http://nomads.ncep.noaa.gov/cgi-bin/filter_gfs_hd.pl?file=gfs.t06z.mastergrb2f12&lev_1000_mb=on&lev_975_mb=on&lev_950_mb=on&lev_925_mb=on&lev_900_mb=on&lev_850_mb=on&lev_800_mb=on&lev_750_mb=on&lev_700_mb=on&lev_650_mb=on&lev_600_mb=on&lev_550_mb=on&lev_500_mb=on&lev_450_mb=on&lev_400_mb=on&lev_350_mb=on&lev_300_mb=on&lev_250_mb=on&lev_200_mb=on&lev_150_mb=on&lev_100_mb=on&lev_70_mb=on&lev_30_mb=on&lev_20_mb=on&lev_10_mb=on&var_HGT=on&var_RH=on&var_TMP=on&var_UGRD=on&var_VGRD=on&var_VVEL=onleftlon=-90rightlon=90toplat=90bottomlat-90&dir=%2Fgfs.2012070706%2Fmaster

If you are testing this string in Firefox and it's not working, change "20120707" to today's date and "06" to "00" and it should work.

My question is simple (I think): why does this work in Firefox and not with urllib?

Here is the code I use to generate the http string and then attempt to download the result:

#Get GRIB files

import urllib

forecast_time='06' #What time the forecast is (00, 06, 12, 18)
forecast_hours='12' #How many hours ahead to forecast (2 or 3 digits)
forecast_date='20120707' #What date the forecast is for yyyymmdd

top_lat=90 #Top of bounding box (North)
bottom_lat=-90 #Bottom of bounding box (South)
left_lon=-90 #Left of bounding box (West)
right_lon=90 #Right of bounding box (East)

griburl='http://nomads.ncep.noaa.gov/cgi-bin/filter_gfs_hd.pl?'
griburl=griburl+'file=gfs.t'+str(forecast_time)+'z.mastergrb2f'
griburl=griburl+forecast_hours

#Select atmospheric levels

griburl=griburl+'&lev_1000_mb=on'  #1000 mb level
griburl=griburl+'&lev_975_mb=on'   #975 mb level
griburl=griburl+'&lev_950_mb=on'   #950 mb level
griburl=griburl+'&lev_925_mb=on'   #925 mb level
griburl=griburl+'&lev_900_mb=on'   #900 mb level
griburl=griburl+'&lev_850_mb=on'   #850 mb level
griburl=griburl+'&lev_800_mb=on'   #800 mb level
griburl=griburl+'&lev_750_mb=on'   #750 mb level
griburl=griburl+'&lev_700_mb=on'   #700 mb level
griburl=griburl+'&lev_650_mb=on'   #650 mb level
griburl=griburl+'&lev_600_mb=on'   #600 mb level
griburl=griburl+'&lev_550_mb=on'   #550 mb level
griburl=griburl+'&lev_500_mb=on'   #500 mb level
griburl=griburl+'&lev_450_mb=on'   #450 mb level
griburl=griburl+'&lev_400_mb=on'   #400 mb level
griburl=griburl+'&lev_350_mb=on'   #350 mb level
griburl=griburl+'&lev_300_mb=on'   #300 mb level
griburl=griburl+'&lev_250_mb=on'   #250 mb level
griburl=griburl+'&lev_200_mb=on'   #200 mb level
griburl=griburl+'&lev_150_mb=on'   #150 mb level
griburl=griburl+'&lev_100_mb=on'   #100 mb level
griburl=griburl+'&lev_70_mb=on'    #70 mb level
griburl=griburl+'&lev_30_mb=on'    #30 mb level
griburl=griburl+'&lev_20_mb=on'    #20 mb level
griburl=griburl+'&lev_10_mb=on'    #10 mb level

#Select variables

griburl=griburl+'&var_HGT=on'  #Height (geopotential m)
griburl=griburl+'&var_RH=on'  #Relative humidity (%)
griburl=griburl+'&var_TMP=on' #Temperature (K)
griburl=griburl+'&var_UGRD=on' #East-West component of wind (m/s)
griburl=griburl+'&var_VGRD=on' #North-South component of wind (m/s)
griburl=griburl+'&var_VVEL=on' #Vertical Windspeed (Pa/s)

#Select bounding box

griburl=griburl+'leftlon='+str(left_lon)
griburl=griburl+'rightlon='+str(right_lon)
griburl=griburl+'toplat='+str(top_lat)
griburl=griburl+'bottomlat'+str(bottom_lat)

#Select date and time

griburl=griburl+'&dir=%2Fgfs.'+forecast_date+forecast_time+'%2Fmaster'
print(griburl)
print('Downloading GRIB file for date '+forecast_date+' time ' +forecast_time + ',    forecasting '+forecast_hours+' hours ahead...')
webf=urllib.urlopen(griburl)
local_filename=forecast_date+'_'+forecast_time+'_'+forecast_hours+'.grib'
localf=open('//home//dantayaga//bovine_aerospace//grib//data//'+local_filename, 'wb')
localf.write(webf.read())
print('Requested grib data written to file '+local_filename)

Any help is most appreciated. Is there a formatting error that Firefox is catching or something?

Upvotes: 2

Views: 4327

Answers (1)

Samy Vilar

Reputation: 11100

Try this:

import urllib2
import urllib

url = 'http://nomads.ncep.noaa.gov/cgi-bin/filter_gfs_hd.pl'

forecast_time = '06' #What time the forecast is (00, 06, 12, 18)
forecast_hours = '09' #How many hours ahead to forecast (2 or 3 digits)
forecast_date = '20120705' #What date the forecast is for yyyymmdd

get_parameters = {
    'subregion':'',
    'toplat':90, #Top of bounding box (North)
    'bottomlat':-90, #Bottom of bounding box (South)
    'leftlon':-90, #Left of bounding box (West)
    'rightlon':90, #Right of bounding box (East)
}

get_parameters['file'] = 'gfs.t' + forecast_time + 'z.mastergrb2f' + forecast_hours

on_variables = [
    'lev_1000_mb',
    'lev_975_mb',
    'lev_950_mb',
    'lev_925_mb',
    'lev_900_mb',
    'lev_850_mb',
    'lev_800_mb',
    'lev_750_mb',
    'lev_700_mb',
    'lev_650_mb',
    'lev_600_mb',
    'lev_550_mb',
    'lev_500_mb',
    'lev_450_mb',
    'lev_400_mb',
    'lev_350_mb',
    'lev_300_mb',
    'lev_250_mb',
    'lev_200_mb',
    'lev_150_mb',
    'lev_100_mb',
    'lev_70_mb',
    'lev_30_mb',
    'lev_20_mb',
    'lev_10_mb',

    'var_HGT',  #Height (geopotential m)
    'var_RH',  #Relative humidity (%)
    'var_TMP', #Temperature (K)
    'var_UGRD', #East-West component of wind (m/s)
    'var_VGRD', #North-South component of wind (m/s)
    'var_VVEL' #Vertical Windspeed (Pa/s)
]

get_parameters.update(dict((param, 'on') for param in on_variables))

#Select date and time
get_parameters['dir'] = '/gfs.' + forecast_date + forecast_time + '/master'

print('Downloading GRIB file for date '+forecast_date+' time ' +forecast_time + ',    forecasting '+forecast_hours+' hours ahead...')

req = urllib2.urlopen(url + '?' + urllib.urlencode(get_parameters), timeout = 300) # There's a bug in Apache for unused GET variables, so we have to add them to the URL manually ...
local_filename = forecast_date + '_' + forecast_time + '_' + forecast_hours + '.grib'
local_file = open('/home/dantayaga/bovine_aerospace/grib/data/' + local_filename, 'wb')

local_file.write(req.read())
local_file.close()

print('Requested grib data written to file ' + local_filename)

I tested it with a different file because the one in your original code doesn't exist. With your original file I get:

Data file is not present: /pub/data/nccf/com/gfs/prod/gfs.2012070706/master27/gfs.t06z.mastergrb2f12

Try to use dictionaries, lists, tuples, or other data structures to store your parameters. That way you'll detect subtle bugs much sooner and reduce duplicated code.
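As an illustration of the kind of subtle bug meant here, the sketch below parses a shortened copy of the URL from the question and prints the parameters a server would see. It is written for Python 3, where the parsing helpers live in urllib.parse (an assumption, since the code in this thread is Python 2), and the shortened URL keeps only a few of the original parameters.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the URL from the question; the bounding-box part is
# kept exactly as the question's string concatenation produced it.
url = ("http://nomads.ncep.noaa.gov/cgi-bin/filter_gfs_hd.pl?"
       "file=gfs.t06z.mastergrb2f12&var_VGRD=on"
       "&var_VVEL=onleftlon=-90rightlon=90toplat=90bottomlat-90"
       "&dir=%2Fgfs.2012070706%2Fmaster")

# parse_qs splits the query on '&' and then on the first '=' of each piece.
params = parse_qs(urlsplit(url).query)

for name, values in sorted(params.items()):
    print(name, values)
```

Because the '&' separators (and one '=') are missing, the bounding box never shows up as separate parameters: everything after var_VVEL= is swallowed into that one value. Building the query from a dictionary with urlencode avoids this class of mistake, since the separators are added for you.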

urllib2.urlopen has an optional parameter called data that is used to pass parameters, but unfortunately there is a bug in either Apache or Python, since I keep getting raise IncompleteRead(value):

httplib: incomplete read
http://bugs.python.org/issue14044
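One thing that may be worth keeping in mind with the data parameter: supplying it turns the request into a POST, whereas appending the encoded parameters to the URL keeps it a GET. The sketch below shows the difference without touching the network. It uses the Python 3 names (urllib.request.Request and urllib.parse.urlencode) as an assumption, and the two parameters are just a small sample. Whether the server's filter script accepts POST at all is not something this thread establishes.

```python
from urllib.parse import urlencode
from urllib.request import Request

url = "http://nomads.ncep.noaa.gov/cgi-bin/filter_gfs_hd.pl"

# A small sample of the parameters, encoded as a query string.
query = urlencode({"file": "gfs.t06z.mastergrb2f09", "var_TMP": "on"})

# Parameters appended to the URL: the request stays a GET.
get_request = Request(url + "?" + query)

# Parameters passed as data: the request becomes a POST with a body.
post_request = Request(url, data=query.encode("ascii"))

print(get_request.get_method())
print(post_request.get_method())
```

Building the Request objects only prepares the requests; nothing is sent until one of them is passed to urlopen.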

Upvotes: 3
