François Richard

Reputation: 7045

Python simple regex: get url name without http:// and /

Another noob question: I would like to match the URL without the leading http:// and the trailing /.

http://somesite.com/ ==> somesite.com

It should work for both http and https:

https://somesite.com/ ==> somesite.com

Apologies for the noob question.

Upvotes: 0

Views: 738

Answers (2)

Jimmy

Reputation: 106

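I realize this is not a regex, but you could use the urlparse module (urllib.parse in Python 3): https://docs.python.org/2/library/urlparse.html

The first function described there, urlparse(), returns a result whose netloc attribute holds the host part of the URL. If netloc also contains a port or credentials, it can be split further as needed.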

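#!/usr/bin/python

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2

url = 'http://stackoverflow.com/questions/28100042/python-simple-regex-get-url-name-without-http-and'
parsed = urlparse(url)
site = parsed.netloc  # 'stackoverflow.com'
print(site)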
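
To illustrate what "split appropriately" could mean, here is a minimal Python 3 sketch. It assumes the URL may carry credentials or a port, and the helper name site_name is only a hypothetical label for this example. The hostname attribute of the parsed result drops both the user info and the port, which netloc keeps.

```python
from urllib.parse import urlsplit


def site_name(url):
    """Return only the host of a URL (hypothetical helper for illustration)."""
    parts = urlsplit(url)
    # netloc can look like "user@somesite.com:8080";
    # hostname strips the credentials and the port and lowercases the host.
    return parts.hostname


url = "https://user@somesite.com:8080/path"
print(urlsplit(url).netloc)  # user@somesite.com:8080
print(site_name(url))        # somesite.com
print(site_name("http://somesite.com/"))  # somesite.com
```

Note that a string without a scheme, such as "somesite.com", has an empty netloc, so this helper would return None for it.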

Upvotes: 1

moliware

Reputation: 10278

I would use urlparse instead (the module is named urllib.parse in Python 3):

>>> import urlparse
>>> url = "http://somesite.com/"
>>> urlparse.urlparse(url).netloc
'somesite.com'
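
For comparison, if a regex is still wanted as in the question, something along these lines might work. This is only a sketch: the pattern and the names HOST_RE and host_from_url are assumptions made for this example, and unlike urlparse it does no real URL validation. It keeps any port or credentials that appear before the first slash.

```python
import re

# Hypothetical pattern: an optional http:// or https:// prefix,
# then capture everything up to the first "/", "?" or "#".
HOST_RE = re.compile(r"^(?:https?://)?([^/?#]+)", re.IGNORECASE)


def host_from_url(url):
    """Return the text between the scheme and the first slash, or None."""
    match = HOST_RE.match(url)
    return match.group(1) if match else None


print(host_from_url("http://somesite.com/"))   # somesite.com
print(host_from_url("https://somesite.com/"))  # somesite.com
```

The urlparse approach is usually the safer choice, since it already handles the edge cases a hand-written pattern tends to miss.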

Upvotes: 3
