Baobab1988

Reputation: 715

How to check URL status for multiple URLs stored in a CSV file and save results to a new CSV file

I'm new to Python and am trying to achieve the following:

I want to check HTTP response status codes for multiple URLs in my input.csv file:

id    url
1    https://www.google.com
2    https://www.example.com
3    https://www.testtesttest.com
...

and save the results to my output.csv file with an additional 'status' column that flags URLs that are down or have some other issue:

id    url                            status
1    https://www.google.com          All good!
2    https://www.example.com         All good!
3    https://www.testtesttest.com    Down
...

So far I have tried the following, without success:

import requests
import pandas as pd
import requests.exceptions

df = pd.read_csv('path/to/my/input.csv')

urls = df.T.values.tolist()[1]


try:
    r = requests.get(urls)
    r.raise_for_status()  
except (requests.exceptions.ConnectionError, requests.exceptions.Timeout):
    print "Down"
except requests.exceptions.HTTPError:
    print "4xx, 5xx"
else:
    print "All good!"

I'm not sure how to collect the results from the code above and save them as a new column in the output.csv file:

df['status'] = #here the result 
df.to_csv('path/to/my/output.csv', index=False)

Would someone be able to help with this? Thanks in advance!

Upvotes: 1

Views: 1130

Answers (1)

David Erickson

Reputation: 16683

id  url
1   https://www.google.com
2   https://www.example.com
3   https://www.testtesttest.com

Copy the sample data above to the clipboard, then run the code below. requests.get takes a single URL rather than a list, so you need to loop through the URLs and append each status to a list. Then assign that list as a new column.

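import requests
import pandas as pd

# Reads the sample table copied above; for a file, use pd.read_csv('path/to/my/input.csv').
df = pd.read_clipboard()

urls = df['url'].tolist()
status = []
for url in urls:
    try:
        # A timeout keeps an unresponsive host from blocking the loop; 10 seconds is an arbitrary choice.
        r = requests.get(url, timeout=10)
        r.raise_for_status()  # raises HTTPError for 4xx and 5xx responses
    except (requests.exceptions.ConnectionError, requests.exceptions.Timeout):
        status.append("Down")
    except requests.exceptions.HTTPError:
        status.append("4xx, 5xx")
    else:
        status.append("All good!")
# One status per URL, in the same order as the rows of df.
df['status'] = status
df.to_csv('path/to/my/output.csv', index=False)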
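
If the URLs come from input.csv rather than the clipboard, the same loop can be wrapped in a small function. The following is only a minimal sketch: check_url, add_status_column, and run are names made up for this example, and the 10-second timeout and the "Other error" label are assumptions that are not in the question. It also assumes input.csv is comma-separated with a url column; if the columns are separated by whitespace, as the sample in the question appears to be, passing sep=r"\s+" to pd.read_csv should handle that.

```python
import pandas as pd
import requests


def check_url(url, get=requests.get, timeout=10):
    """Return a status label for one URL (labels taken from the question)."""
    try:
        # Without a timeout, requests can wait indefinitely on a silent host.
        response = get(url, timeout=timeout)
        response.raise_for_status()
    except (requests.exceptions.ConnectionError, requests.exceptions.Timeout):
        return "Down"
    except requests.exceptions.HTTPError:
        return "4xx, 5xx"
    except requests.exceptions.RequestException:
        # Assumed extra label for any other requests error, e.g. a malformed URL.
        return "Other error"
    return "All good!"


def add_status_column(df, get=requests.get):
    """Return a copy of df with a 'status' column computed from df['url']."""
    result = df.copy()
    result["status"] = [check_url(url, get=get) for url in result["url"]]
    return result


def run(input_path, output_path):
    """Read the input CSV, check every URL, and write the output CSV."""
    df = pd.read_csv(input_path)
    add_status_column(df).to_csv(output_path, index=False)
```

Calling run('path/to/my/input.csv', 'path/to/my/output.csv') would then write the original columns plus status. The get parameter is there only so the function can be tried out with a stand-in for requests.get, without touching the network.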

Upvotes: 2
