Nimitz14

Reputation: 2328

Failing to get duration of youtube video using xpath

I wanted to write something that returns the duration of a YouTube video given its link. I found requests and lxml and started out following this guide.

Here's the setup:

import requests
from lxml import html

url = 'https://www.youtube.com/watch?v=EN8fNb6uhns'
page = requests.get(url)
tree = html.fromstring(page.content)

Then I try to get the duration with XPath, but it doesn't work:

tree.xpath('//span[@class="ytp-time-duration"]/text()')

This returns an empty list. But when I try to get the title (as a test) with:

tree.xpath('//h1[@class="watch-title-container"]/span/text()')

it works. When I use the browser's Inspect tool to copy the XPath of the duration element, nothing is returned:

tree.xpath('/html/body/div[2]/div[4]/div/div[4]/div[2]/div[2]/div/div[24]/div[2]/div[1]/div/span[3]')

When I do the same for the title it works again.

What is going on?

Upvotes: 2

Views: 658

Answers (2)

JyoGi108

Reputation: 31

For YouTube, the XPath was not consistent. I got two different XPaths for the video duration element:

//*[@id='movie_player']/div[5]/div/div/div[5]/button/div[1]

//*[@id="movie_player"]/div[26]/div[2]/div[1]/div/span[3]

So I tried finding the element by class name instead:

FindElement(By.ClassName("ytp-time-duration"))

This always worked.

string VideoDuration = firfxdrivr.FindElement(By.ClassName("ytp-time-duration")).GetAttribute("textContent");

Console.WriteLine(VideoDuration);

Output: 19:18
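
Since the question uses Python, here is a rough equivalent of the C# above, as a sketch only. It assumes Selenium 4 with Firefox and geckodriver installed. The names video_duration and to_seconds and the 15-second timeout are made up for illustration, and the explicit wait is an addition that the C# snippet does not have.

```python
def to_seconds(timestamp):
    """Convert 'M:SS' or 'H:MM:SS' text, such as '19:18', to seconds."""
    total = 0
    for part in timestamp.strip().split(":"):
        total = total * 60 + int(part)
    return total


def video_duration(url, timeout=15):
    """Open url in Firefox and return the player's duration text."""
    # Imported here so to_seconds() stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        # Wait until the player script has added the element to the DOM.
        element = WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CLASS_NAME, "ytp-time-duration"))
        )
        # textContent is returned even if the control bar is hidden.
        return element.get_attribute("textContent")
    finally:
        driver.quit()
```

Calling video_duration(url) with the URL from the question should give text in the same form as the output above, and to_seconds turns that text into a number. The text may still be empty if the player has not finished loading when the element first appears, so a retry may be needed.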

Upvotes: 0

宏杰李

Reputation: 12158

span[@class="ytp-time-duration"]

This span tag is generated by JavaScript, so it is not returned by requests. requests only returns the HTML the server sends and does not run any scripts.
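
If running a browser is not an option, the value can sometimes be read from the raw page source instead. The sketch below assumes the served page carries the length in an inline script as a "lengthSeconds" field. That is an undocumented detail that may change or be missing. The function name duration_from_source and the sample string are made up for illustration.

```python
import re

# Assumption: the raw page embeds the length as "lengthSeconds":"<digits>"
# inside an inline script. This is undocumented and may change.
LENGTH_RE = re.compile(r'"lengthSeconds"\s*:\s*"(\d+)"')


def duration_from_source(source):
    """Return the duration as M:SS or H:MM:SS, or None if the field is absent."""
    match = LENGTH_RE.search(source)
    if match is None:
        return None
    minutes, seconds = divmod(int(match.group(1)), 60)
    hours, minutes = divmod(minutes, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{seconds:02d}"
    return f"{minutes}:{seconds:02d}"


# Made-up stand-in for page.text from the question's requests.get(url).
sample = '<script>var data = {"videoDetails":{"lengthSeconds":"1158"}};</script>'
print(duration_from_source(sample))
```

With the setup from the question, the call would be duration_from_source(page.text). A result of None means the field was not in the response, in which case a real browser is the fallback.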

Upvotes: 1
