Reputation: 93
I want to extract information about a particular YouTube video (such as its title and view count) using Python, the same way I have scraped other websites. But for some reason the request either returns nothing or returns tags only for the recommended videos in the sidebar, not for the main video at that URL.
I tried the same code that I use for scraping other websites, shown below. Apparently it doesn't work on YouTube. What should I do if I want to get video information from a YouTube URL?
import requests
from bs4 import BeautifulSoup

base_url = 'https://www.youtube.com/watch?'
search_string = 'v=I41aLSzLI50'
url = base_url + search_string

# Download the raw HTML and parse it
supers = requests.get(url).content
data = BeautifulSoup(supers, 'html.parser')

# Find the video links and print the title inside each one
videos = data.find_all('a', class_='content-link spf-link yt-uix-sessionlink spf-link')
for video in videos:
    print(video.find('span', class_='title').get_text())
Upvotes: 4
Views: 3786
Reputation: 53
I looked at a page on YouTube, and it seems that the information you are looking for is not in the original page source (at least not where you are expecting it). Scripts create that content when your browser renders the page, so a plain HTTP request never sees it. Based on my experience, you have a few options.
Use one of the APIs the commenters suggested. I am not very familiar with these, but it might save you some time and effort. Web scraping can be fragile because page formats change, so scraping scripts may need to be updated.
If you insist on web scraping, you can use an automated browser. I used to use Selenium on a regular basis and it should work for your purposes. This will allow you to work with content generated by scripts.
I looked at the page source, and the information you are looking for appears to be contained within some script tags, but parsing this will be a pain.
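As a rough sketch of the first option, here is how a lookup through the YouTube Data API v3 `videos` endpoint might look. The commenters' suggestions are not quoted here, so the choice of this particular API is an assumption, as are the helper names (`video_id_from_url`, `build_request_url`, `parse_video_info`, `fetch_video_info`). It uses only the standard library instead of `requests`, and you would need your own API key from Google for the network call to succeed.

```python
import json
from urllib.parse import parse_qs, urlencode, urlparse
from urllib.request import urlopen

# Assumed API: the YouTube Data API v3 "videos" endpoint.
API_ENDPOINT = "https://www.googleapis.com/youtube/v3/videos"


def video_id_from_url(url):
    """Return the value of the `v` query parameter of a watch URL, or None."""
    values = parse_qs(urlparse(url).query).get("v")
    return values[0] if values else None


def build_request_url(video_id, api_key):
    """Build a request URL asking for the snippet and statistics parts."""
    params = {"part": "snippet,statistics", "id": video_id, "key": api_key}
    return API_ENDPOINT + "?" + urlencode(params)


def parse_video_info(payload):
    """Pull the title and view count out of a decoded API response."""
    items = payload.get("items", [])
    if not items:
        return None  # no match, e.g. an unknown, private, or deleted video
    item = items[0]
    # Counts arrive as strings and may be absent if statistics are hidden.
    views = item.get("statistics", {}).get("viewCount")
    return {
        "title": item["snippet"]["title"],
        "views": int(views) if views is not None else None,
    }


def fetch_video_info(url, api_key):
    """Network call: look up the video behind `url` (needs a valid API key)."""
    video_id = video_id_from_url(url)
    with urlopen(build_request_url(video_id, api_key)) as response:
        return parse_video_info(json.load(response))
```

Calling `fetch_video_info('https://www.youtube.com/watch?v=I41aLSzLI50', api_key)` with a valid key should return a dictionary with `title` and `views`. Keeping the parsing separate from the request makes it easy to check the logic against a saved response without touching the network.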
Upvotes: 2