newbie68

Reputation: 5

Importing a csv with a long list of urls to Python in order to find 404 errors

I'm trying to do a simple task, but I am very new to Python, so I would appreciate some help. I have this piece of code to find 404 errors:

import requests

try:
    r = requests.head("http://stackoverflow.com")
    print(r.status_code)

except requests.ConnectionError:
    print("failed to connect")

I obtained it from an answer here on Stack Overflow (thanks to user Goumeau). I have thousands of urls in a csv which I would like to import and then run this code against. What I am looking for in the end is a list containing each url and the http status code associated with it. The question is: how do I import my list of urls and then run the code above over each of them in turn?

And if I'm lucky, how would I then obtain the list of answers?

Thanks for reading.

Upvotes: 0

Views: 496

Answers (1)

b10n

Reputation: 1186

I'm assuming a file of urls, one per line.

import requests

def get_url_status(url):
    try:
        r = requests.head(url)
        return url, r.status_code
    except requests.ConnectionError:
        print("failed to connect")
        return url, 'error'

results = {}
with open('url.csv') as infile:
    for line in infile:
        url = line.strip()  # drop the trailing newline
        if not url:         # skip blank lines
            continue
        url_status = get_url_status(url)
        results[url_status[0]] = url_status[1]
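If you then want that list of answers back on disk, a minimal sketch using the standard csv module could look like this (the `results` dict and the output filename `statuses.csv` here are just example values):

```python
import csv

# example results dict, as produced by the loop above
results = {
    "http://stackoverflow.com": 200,
    "http://example.invalid": "error",
}

# write one "url,status" row per entry
with open("statuses.csv", "w", newline="") as outfile:
    writer = csv.writer(outfile)
    writer.writerow(["url", "status"])
    for url, status in results.items():
        writer.writerow([url, status])
```

This gives you a two-column csv you can open in a spreadsheet and filter for 404s.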

Upvotes: 1
