Will

Reputation: 255

Python check if webpage is HTTP or HTTPS

I am working with websites in my script and want to find out whether a website accepts HTTP or HTTPS. I have the code below, but it doesn't appear to give me any response. Is there a way I can find out if a site accepts HTTP or HTTPS and then tell the script to do something?

from urllib.parse import urlparse
import http.client
import sys


def check_url(url):
    url = urlparse(url)
    conn = http.client.HTTPConnection(url.netloc)
    conn.request('HEAD', url.path)
    if conn.getresponse():
        return True
    else:
        return False


if __name__ == '__name__':
    url = 'http://stackoverflow.com'
    url_https = 'https://' + url.split('//')[1]
    if check_url(url_https):
        print 'Nice, you can load it with https'
    else:
        if check_url(url):
            print 'https didnt load but you can use http'
    if check_url(url):
        print 'Nice, it does load with http too'

Typo in code: if __name__ == '__name__': should be if __name__ == '__main__':

Upvotes: 1

Views: 2319

Answers (4)

Kishore Sampath

Reputation: 1001

Try changing if __name__ == '__name__': to if __name__ == '__main__':

I have also refactored the code and implemented my solution in Python 3. The HTTPConnection class does not check whether the website uses HTTP or HTTPS; the check returns True for both kinds of website, so I have used the HTTPSConnection class for the HTTPS check.

from urllib.parse import urlparse
from http.client import HTTPConnection, HTTPSConnection

BASE_URL = 'stackoverflow.com'

def check_https_url(url):
    HTTPS_URL = f'https://{url}'
    try:
        HTTPS_URL = urlparse(HTTPS_URL)
        connection = HTTPSConnection(HTTPS_URL.netloc, timeout=2)
        connection.request('HEAD', HTTPS_URL.path)
        if connection.getresponse():
            return True
        else:
            return False
    except Exception:  # connection, TLS, or timeout failure
        return False

def check_http_url(url):
    HTTP_URL = f'http://{url}'
    try:
        HTTP_URL = urlparse(HTTP_URL)
        connection = HTTPConnection(HTTP_URL.netloc)
        connection.request('HEAD', HTTP_URL.path)
        if connection.getresponse():
            return True
        else:
            return False
    except Exception:  # connection or timeout failure
        return False

if __name__ == "__main__":
    if check_https_url(BASE_URL):
        print("Nice, you can load the website with HTTPS")
    elif check_http_url(BASE_URL):
        print("HTTPS didn't load the website, but you can use HTTP")
    else:
        print("Both HTTP and HTTPS did not load the website, check whether your url is malformed.")
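
Note that getresponse() returns a response object that is always truthy, so both checks above report success whenever the server answers at all, even with an error status. Below is a minimal sketch that returns the status code instead, so the caller can decide what counts as success. The names probe and supported_schemes are hypothetical helpers of my own, and the 5-second default timeout is an arbitrary assumption.

```python
from http.client import HTTPConnection, HTTPSConnection, HTTPException


def probe(scheme, host, path="/", timeout=5):
    """Send a HEAD request over the given scheme; return the status code or None."""
    # Pick the connection class from the scheme; HTTPConnection alone never uses TLS.
    connection_cls = HTTPSConnection if scheme == "https" else HTTPConnection
    connection = connection_cls(host, timeout=timeout)
    try:
        connection.request("HEAD", path)
        return connection.getresponse().status
    except (OSError, HTTPException):
        # OSError covers DNS failures, refused connections, timeouts and TLS errors.
        return None
    finally:
        connection.close()


def supported_schemes(host, timeout=5):
    """Return the schemes, tried in the order https then http, that got any response."""
    return [
        scheme
        for scheme in ("https", "http")
        if probe(scheme, host, timeout=timeout) is not None
    ]
```

Calling supported_schemes('stackoverflow.com') would return the schemes that answered. Redirects (3xx) and error statuses count as an answer in this sketch, so inspect the value returned by probe if you need to tell them apart.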

Upvotes: 4

Krishna Chaurasia

Reputation: 9572

Your code has a typo in the line if __name__ == '__name__':.

Changing it to if __name__ == '__main__': solves the problem.
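
To see why the original guard never fires, here is a small sketch; describe_run_mode is a hypothetical helper name used only for illustration. When a file is run directly, Python sets __name__ to '__main__'; when it is imported, __name__ is the module's name. In a normal run it is not the literal string '__name__', so the comparison in the question is False and the body is skipped without any error.

```python
def describe_run_mode(module_name):
    """Explain how a module with this __name__ value was loaded."""
    if module_name == "__main__":
        return "run directly as a script"
    return "imported as module " + repr(module_name)


# Pass in the real value of __name__ for whatever file this code lives in.
print(describe_run_mode(__name__))
```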

Upvotes: 4

Icebreaker454

Reputation: 1071

The fundamental issues with your script are as follows:

  • The urllib.parse module was introduced in Python 3. In Python 2, the separate urlparse module served that purpose (see "url.parse Python2.7 equivalent"). I assume you are running Python 2 because of the print statements without parentheses.
  • An if-main construct should look like if __name__ == '__main__': instead of if __name__ == '__name__'.
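
If the script has to run on both interpreters, one common pattern is to fall back to the Python 2 module when the Python 3 import fails. This is only a sketch; the URL is an example value.

```python
try:
    from urllib.parse import urlparse  # Python 3 location
except ImportError:
    from urlparse import urlparse  # Python 2 location

# Split an example URL into the pieces the connection code needs.
parts = urlparse("http://stackoverflow.com/questions")
print(parts.scheme)
print(parts.netloc)
print(parts.path)
```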

I tried the following snippet on Python 3, and it worked out pretty well.

from urllib.parse import urlparse
import http.client
import sys


def check_url(url):
    url = urlparse(url)
    conn = http.client.HTTPConnection(url.netloc)
    conn.request('HEAD', url.path)
    if conn.getresponse():
        return True
    else:
        return False


if __name__ == '__main__':
    url = 'http://stackoverflow.com'
    url_https = 'https://' + url.split('//')[1]
    if check_url(url_https):
        print('Nice, you can load it with https')
    else:
        if check_url(url):
            print('https didnt load but you can use http')
    if check_url(url):
        print('Nice, it does load with http too')

Upvotes: 2

Josh

Reputation: 159

I think your problem is the line if __name__ == '__name__':. I assume it will work for you like this: if __name__ == '__main__':

Upvotes: 1
