Yuras

Reputation: 482

Regex for absolute URLs

I have been searching for a while for a regex, compatible with Python's re module, that finds all URLs in an HTML document. The only one I found can just check whether a URL is valid or invalid (with the match method). I want to do something simple like:

import requests
html_response = requests.get('http://example.com').text
# url_pattern is the compiled regex I am looking for
urls = url_pattern.findall(html_response)

I suppose the regex I need (if it exists) would have to be complex enough to account for a lot of special cases of URLs, so it probably cannot be a one-liner.

Upvotes: 1

Views: 292

Answers (1)

Anurag Verma

Reputation: 495

Use BeautifulSoup instead of a regex. It's simple to use and lets you parse HTML pages.

See this answer: How to extract URLs from an HTML page in Python
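
To illustrate the parser-based approach, here is a minimal sketch. Note that it swaps in the standard library's html.parser and urllib.parse.urljoin rather than BeautifulSoup, so it runs without extra installs; with BeautifulSoup the collection step would be roughly soup.find_all('a', href=True). The names LinkCollector, extract_urls and sample_html are made up for this example, and limiting the result to href/src attributes with an http or https scheme is an assumption about what "all URLs" should mean here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects href/src attribute values and resolves them to absolute URLs."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # value is None for attributes written without a value
            if name in ("href", "src") and value:
                absolute = urljoin(self.base_url, value)
                # Assumption: only http(s) links are wanted (skips mailto:, javascript:, ...)
                if urlparse(absolute).scheme in ("http", "https"):
                    self.urls.append(absolute)


def extract_urls(html, base_url):
    """Return absolute URLs found in href/src attributes, in document order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.urls


# Hypothetical sample page; in the question this would be requests.get(...).text
sample_html = """
<html><body>
  <a href="http://example.com/a">absolute</a>
  <a href="/b?x=1">relative</a>
  <img src="img/logo.png">
  <a href="mailto:someone@example.com">mail</a>
  <a>no href</a>
</body></html>
"""

urls = extract_urls(sample_html, "http://example.com/dir/page.html")
print(urls)
```

Relative links are resolved against the page URL, so the result contains only absolute URLs, which a regex over the raw HTML could not do on its own.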

Upvotes: 4
