Regex for absolute url

Question

I have been searching for quite a while for a regex, compatible with Python's re module, that finds all URLs in an HTML document. The only one I found can merely check whether a URL is valid or invalid (with the match method). I want to do something as simple as this:

import requests
html_response = requests.get('http://example.com').text
urls = url_pattern.findall(html_response)

I suppose the regex I need (if one exists) would have to be complex enough to handle a bunch of special cases of URLs, so it cannot be a one-liner.

Anurag Verma · Accepted Answer

Use BeautifulSoup instead. It's simple to use and lets you parse HTML pages properly rather than pattern-matching on raw text.

See this answer: How to extract URLs from an HTML page in Python.
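
A minimal sketch of what that could look like, assuming the bs4 package is installed and that you only care about the href attributes of <a> tags. The function name extract_absolute_urls, the sample HTML, and the http/https filter are illustrative assumptions, not part of the linked answer. Relative links are resolved against the page URL with urllib.parse.urljoin so that every result is absolute.

```python
from urllib.parse import urljoin, urlparse

from bs4 import BeautifulSoup


def extract_absolute_urls(html, base_url):
    """Return absolute http(s) URLs found in <a href="..."> tags.

    Hypothetical helper for illustration; base_url is the address
    the HTML was fetched from.
    """
    soup = BeautifulSoup(html, "html.parser")
    urls = []
    # href=True keeps only <a> tags that actually carry an href attribute
    for tag in soup.find_all("a", href=True):
        # Resolve relative links such as "/about" against the page URL
        absolute = urljoin(base_url, tag["href"])
        # Assumption: skip non-web schemes such as mailto: or javascript:
        if urlparse(absolute).scheme in ("http", "https"):
            urls.append(absolute)
    return urls


# Made-up sample document standing in for requests.get(...).text
sample_html = (
    '<a href="/about">About</a> '
    '<a href="http://example.org/x">X</a> '
    '<a href="mailto:me@example.com">Mail</a>'
)
print(extract_absolute_urls(sample_html, "http://example.com"))
```

With the code from the question, you would pass html_response and the URL you requested instead of the sample string. If you also need URLs from other attributes (for example src on <img> or <script>), the loop would have to be extended to those tags.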
