How to use beautifulsoup to get redirect html?

Question

I'm looking at a web page whose header contains a meta refresh tag redirecting to google.com. How can I get the content of the google.com page with bs4?

Thanks!

Antti Haapala -- Слава Україні · Accepted Answer

Use find with the tag name meta and attrs matching the known fixed attribute: http-equiv must have the value refresh. find returns the first such element; take the value of its 'content' attribute, then parse it for the URL.

Thus you get:

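>>> from bs4 import BeautifulSoup
>>> fragment = """<meta content="5;url=http://google.com" http-equiv="refresh"/>"""
>>> soup = BeautifulSoup(fragment, 'html.parser')
>>> element = soup.find('meta', attrs={'http-equiv': 'refresh'})
>>> element
<meta content="5;url=http://google.com" http-equiv="refresh"/>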

>>> refresh_content = element['content']
>>> refresh_content
u'5;url=http://google.com'

>>> url = refresh_content.partition('=')[2]
>>> url
u'http://google.com'
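
The partition('=') call above is enough for this particular value. Real pages vary, though: the URL may be quoted, the "url=" key may be upper case or missing, and the target may be a relative path. A minimal sketch of a more tolerant parser follows, using only the standard library on Python 3. The helper name parse_refresh_content and its base_url parameter are my own, not part of bs4.

```python
from urllib.parse import urljoin


def parse_refresh_content(content, base_url=None):
    """Split a meta refresh content value into (delay_seconds, url).

    url is None when the value carries only a delay, e.g. "5".
    A non-numeric delay raises ValueError.
    """
    delay_part, _, rest = content.partition(';')
    delay = float(delay_part.strip() or 0)
    rest = rest.strip()
    url = None
    if rest:
        key, sep, value = rest.partition('=')
        if sep and key.strip().lower() == 'url':
            # Usual form "url=..."; drop optional quotes around the URL
            url = value.strip().strip('\'"')
        else:
            # Assumption: with no "url=" key, the rest is the URL itself
            url = rest.strip('\'"')
        if base_url is not None:
            # Resolve a relative target against the page's own address
            url = urljoin(base_url, url)
    return delay, url
```

You would call it as parse_refresh_content(element['content']). To get the content of the target page itself, fetch the returned URL (for example with urllib.request.urlopen) and pass the response body to BeautifulSoup again.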
