Samuel G.

Reputation: 61

Web scraping YouTube with Python 3

I'm working on a project where I need to store the date on which a YouTube video was published.
The problem is that I'm having trouble finding this date in the page's HTML source.

Here's my code attempt:

import requests
from bs4 import BeautifulSoup as BS

url = "https://www.youtube.com/watch?v=XQgXKtPSzUI&t=915s"
response = requests.get(url)
soup = BS(response.content, "html.parser")
response.close()

dia = soup.find_all('span',{'class':'date'})
print(dia)

Output:

[]

I know that the arguments I'm passing to .find_all() are wrong.
I say this because the same code let me extract other information about the video, such as the title and the view count.
I've tried different arguments to .find_all() but haven't figured out how to find the date.

Upvotes: 1

Views: 3625

Answers (3)

j_nim

Reputation: 1

Try passing the attributes through the attrs keyword, as shown below:

dia = soup.find_all('span', attrs={'class':'date'})

Upvotes: 0

JakeJ

Reputation: 2511

If you use pafy, the object it returns exposes the published date directly.

Install pafy: "pip install pafy"

import pafy
vid = pafy.new("www.youtube.com/watch?v=2342342whatever")
published_date = vid.published
print(published_date)   # print() is a function in Python 3

Check out the pafy docs for more info: https://pythonhosted.org/Pafy/ I'm leaving the doc link because it's a really neat module: it fetches the data without a separate request module and also exposes a bunch of other useful properties of the video, such as the best-format download link.

Upvotes: 3

AimiHat

Reputation: 383

It seems that YouTube uses JavaScript to add the date, so it does not appear in a plain HTML element such as a span when you fetch the page with requests. You can either scrape with Selenium, which renders the page, or pull the date out of the JavaScript itself, since that script is included in the page source.
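
As a rough illustration of the second option, here is a minimal sketch that searches the raw page source for the date. It assumes the date appears either in a meta tag with itemprop="datePublished" or under a "publishDate" key in the embedded script data; both names are assumptions on my part, not something confirmed in this thread, and the page layout may change. The function name extract_publish_date is hypothetical.

```python
import re

# Assumption: the date is present in the raw source in one of these two forms.
# Neither pattern is guaranteed; adjust them to whatever the page actually contains.
META_RE = re.compile(r'<meta[^>]*itemprop="datePublished"[^>]*content="([^"]+)"')
JSON_RE = re.compile(r'"publishDate"\s*:\s*"([^"]+)"')


def extract_publish_date(html):
    """Return the first publish-date string found in html, or None if absent."""
    for pattern in (META_RE, JSON_RE):
        match = pattern.search(html)
        if match:
            return match.group(1)
    return None


# Small inline sample standing in for the fetched page source.
sample = '<html><head><meta itemprop="datePublished" content="2019-05-01"></head></html>'
print(extract_publish_date(sample))
```

With the code from the question, you would pass response.text to extract_publish_date() instead of the inline sample. If it returns None, the patterns do not match the current page and need to be adapted.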

Upvotes: 0
