Reputation: 27
I have a web scraping Python script that asks for a web address when you run it. I want to validate the user's input, e.g. check whether it is a valid web address and handle the case where the user enters nothing. I have added a try/except, which almost works: it displays the message I want the user to see, but it also prints a traceback, and I don't want that. I only want to display my custom error message. Could anyone help me implement this? Here's my code:
import sys, urllib, urllib2
try:
    url= raw_input('Please input address: ')
    webpage=urllib.urlopen(url)
    print 'Web address is valid'
except:
    print 'No input or wrong url format usage: http://wwww.domainname.com '
def wget(webpage):
    print '[*] Fetching webpage...\n'
    page = webpage.read()
    return page
def main():
    sys.argv.append(webpage)
    if len(sys.argv) != 2:
        print '[-] Usage: webpage_get URL'
        return
    print wget(sys.argv[1])
if __name__ == '__main__':
    main()
Upvotes: 2
Views: 4304
Reputation: 954
You can simply do:
try:
    # ...
except Exception as e:
    print("What you want to show")
Edit: "How do I stop it from executing when it reach an exception?"
You can either put a try and except in wget(), as @sabujhassan mentioned, or you can exit on catching the exception:
except Exception as e:
    print("Exception caught!")
    exit(1)
Edit 2: "Is it possible to loop the program, e.g. when there is no user input, just keep asking the user to input a web address?" Yes. Wrap the logic in an infinite while loop and break once a valid value has been entered:
while True:
    try:
        # Your logic ...
        break
    except:
        print 'No input or wrong url format usage: http://wwww.domainname.com '
        print 'Try again!'
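To make the "Your logic" part concrete, here is a minimal sketch of such a loop. It is written for Python 3 (where raw_input is called input), and the helper names is_valid_url and ask_for_url are made up for this example. It only checks that the text looks like an http(s) URL with a host part; it does not check that the site exists.

```python
from urllib.parse import urlparse


def is_valid_url(url):
    """Return True if url has an http(s) scheme and a host part."""
    parts = urlparse(url.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def ask_for_url(read=input):
    """Keep prompting until the user enters something that looks like a URL.

    `read` is a hypothetical hook so the prompt can be replaced in tests;
    by default it is the built-in input().
    """
    while True:
        url = read("Please input address: ").strip()
        if is_valid_url(url):
            return url
        print("No input or wrong url format. Usage: http://www.domainname.com")
        print("Try again!")
```

Calling ask_for_url() with no arguments prompts on the console and returns only once the format check passes, so no traceback is shown for empty or malformed input.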
Upvotes: 3
Reputation: 3397
Try this (replaces lines 3-8) :
def main(url = None) :
    if not url : # no arguments through sys.argv
        url = raw_input('Please input address: ')
    if url : # the user typed something
        try :
            webpage = urllib.urlopen(url)
        except : # Funky arguments from the user
            print 'No input or wrong url format usage: http://wwww.domainname.com '
        else : # only fetch when the URL opened successfully
            print 'Web address is valid'
            print wget(webpage)
    else : # No arguments from the user
        print 'No input or wrong url format usage: http://wwww.domainname.com '
For the latter half, (from main onwards) :
if __name__ == '__main__':
    if len(sys.argv) == 2 :
        main(sys.argv[1])
    else :
        main()
(I disapprove of the pythonic 4 spaces rule, I keep on having to replace spacebars)
Upvotes: 0
Reputation: 101
You have the initial try/except, but you don't exit once the exception is caught. webpage is only assigned when urlopen succeeds, so when it fails the script carries on and main() later raises a NameError because "webpage" was never defined. That is where the traceback comes from, so the answer is to quit as soon as the exception is caught.
So:
try:
    url= raw_input('Please input address: ')
    webpage=urllib.urlopen(url)
    print 'Web address is valid'
except:
    print 'No input or wrong url format usage: http://wwww.domainname.com '
    sys.exit(1)
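If you would rather not use a bare except (it also swallows things like Ctrl+C), here is a rough Python 3 version of the same idea that catches only the errors urlopen is expected to raise. In Python 3, urllib.urlopen lives at urllib.request.urlopen. The function name open_or_exit and its opener parameter are made up for this sketch.

```python
import sys
import urllib.error
import urllib.request


def open_or_exit(url, opener=urllib.request.urlopen):
    """Open url, or print a short message and exit with status 1.

    `opener` is a hypothetical hook that defaults to urlopen; it is only
    there so the function can be tried without a network connection.
    """
    try:
        return opener(url)
    except (ValueError, urllib.error.URLError):
        # ValueError: empty or malformed URL; URLError: the request failed.
        print("No input or wrong url format. Usage: http://www.domainname.com")
        sys.exit(1)
```

Because the script stops inside the except block, nothing later in the file can trip over an undefined webpage.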
Upvotes: 1
Reputation: 39355
Use try/except in both functions, wget() and main(). For example:
def wget(webpage):
    try:
        print '[*] Fetching webpage...\n'
        page = webpage.read()
        return page
    except:
        print "exception!"
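One thing to watch with this approach: when the except branch runs, wget() returns None, so the caller should check for that before printing. A small Python 3 sketch of how that could look; the show() helper is invented for this example, and catching OSError is an assumption that covers most, but not all, network and file read errors.

```python
def wget(webpage):
    """Return the page body, or None if reading fails."""
    try:
        print("[*] Fetching webpage...\n")
        return webpage.read()
    except OSError as exc:
        # Most network and file read errors are OSError subclasses in Python 3.
        print("Could not read the page: %s" % exc)
        return None


def show(webpage):
    """Print the page only when wget() actually returned something."""
    page = wget(webpage)
    if page is None:
        return False
    print(page)
    return True
```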
Upvotes: 1