Reputation: 195
I'm trying to write code that checks whether a domain is registered by querying whois.domaintools.com.
But there's a problem: the HTML I read back does not match the page source of whois.domaintools.com/notregistereddomain.com. What's wrong? Is it a problem with the request, or something else? I really don't know how to solve it.
import urllib2

def getPage():
    url = "http://whois.domaintools.com/notregistereddomain.com"
    req = urllib2.Request(url)
    try:
        response = urllib2.urlopen(req)
        return response.read()
    except urllib2.HTTPError, error:
        a = error.read()  # read the error body once; a second read() returns ''
        print "error: ", a
        f = open("URL.txt", "a")
        f.write(a)
        f.close()

if __name__ == "__main__":
    namesPage = getPage()
    print namesPage
Upvotes: 0
Views: 965
Reputation: 7707
If you use print error instead of print error.read(), you'll see that you're getting an HTTP 403 "Forbidden" response from the server.
Apparently this server doesn't like requests without a User-Agent header (or it doesn't like Python's default one, because it doesn't want to be queried from a script). Here's a workaround:
user_agent = "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)" # Or any valid user agent from a real browser
headers = {"User-Agent": user_agent}
req = urllib2.Request(url, headers=headers)
res = urllib2.urlopen(req)
print res.read()
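For reference, a minimal Python 3 sketch of the same workaround (in Python 3, urllib2 was split into urllib.request and urllib.error; the user-agent string here is just a placeholder, and get_page is a hypothetical helper name):

```python
import urllib.request
import urllib.error

def get_page(url, user_agent="Mozilla/5.0"):
    # Attach a browser-like User-Agent so the server doesn't reject the request.
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        return urllib.request.urlopen(req).read()
    except urllib.error.HTTPError as error:
        # Without the header, a 403 would surface here.
        print("error:", error.code)
        return None

# The header is attached to the Request object before anything is sent,
# which can be verified without touching the network:
req = urllib.request.Request(
    "http://whois.domaintools.com/notregistereddomain.com",
    headers={"User-Agent": "Mozilla/5.0"},
)
print(req.get_header("User-agent"))  # Mozilla/5.0
```

Note that urllib.request capitalizes stored header names, so the lookup key is "User-agent" rather than "User-Agent".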
Upvotes: 2