Reputation: 343
Using requests.get(URL, allow_redirects=True) on YouTube, Reddit, or other modern web pages returns a bunch of unexecuted JavaScript rather than the actual HTML text content that I would see when I open the page in a browser.
I just need to get the title of the video. How can I do that in a lightweight way, without starting something heavy like Selenium or Puppeteer and without using the YouTube API?
Upvotes: 0
Views: 179
Reputation: 177
I was able to find the video title among all the JavaScript and HTML.
>>> import re
>>> import requests
>>> r = requests.get("https://www.youtube.com/watch?v=UjLnvXpkq68", allow_redirects=True)
>>> m = re.search(r'"title":"(.*?)"', r.text)
>>> m.group(1)
'DJ OKAWARI「Perfect Blue」'
Probably not the prettiest solution, but using a regex lets you avoid having to parse the entire document.
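If you want something a bit more forgiving, here is a minimal sketch of the same regex idea wrapped in a function. The name extract_title is a hypothetical helper, not part of requests or any YouTube library. It assumes the title appears either as a JSON-style "title":"..." pair (as above), in an og:title meta tag, or in a plain title element; whether a given site actually provides any of these is an assumption you would need to check against the real response.

```python
import html
import json
import re


def extract_title(page_text):
    """Hypothetical helper: return the first title found in page_text, or None."""
    # 1. JSON-style "title":"..." as in the regex above, but tolerating
    #    backslash-escaped characters (e.g. \" inside the title).
    m = re.search(r'"title":"((?:[^"\\]|\\.)*)"', page_text)
    if m:
        try:
            # Decode JSON escapes such as \" or \u0026.
            return json.loads('"' + m.group(1) + '"')
        except json.JSONDecodeError:
            # Fall back to the raw match if it is not valid JSON string content.
            return m.group(1)

    # 2. Open Graph meta tag (assumes property is written before content).
    m = re.search(r'<meta\s+property="og:title"\s+content="([^"]*)"', page_text)
    if m:
        return html.unescape(m.group(1))

    # 3. Plain <title> element.
    m = re.search(r'<title[^>]*>(.*?)</title>', page_text, re.S)
    if m:
        return html.unescape(m.group(1)).strip()

    return None
```

With the response from the snippet above, the call would be extract_title(r.text). It is still pattern matching rather than real parsing, so it can break whenever the site changes its markup.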
Upvotes: 1