Tejas Dhanani

Reputation: 35

How to extract the view count of each video in YouTube search results using Selenium?

What I want:

What I have tried

from selenium import webdriver

driver=webdriver.Chrome(executable_path='C:\\ProgramData\\chocolatey\\bin\\chromedriver.exe')
search = 'Believer from Imagine Dragons'
driver.get("https://www.youtube.com/results?search_query=" + search)

main = driver.find_elements_by_id("metadata")
for datas in main:
    info = datas.find_elements_by_id("metadata-line")
    for views in info:
        view_counts = views.find_element_by_xpath("""//*[@id="metadata-line"]/span[1]""")
        print('view_counts: ' + str(view_counts.text))

Output of this:

view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views
view_counts: 104M views

What I have also tried

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver=webdriver.Chrome(executable_path='C:\\ProgramData\\chocolatey\\bin\\chromedriver.exe')
search = 'Believer from Imagine Dragons'
driver.get("https://www.youtube.com/results?search_query=" + search)


main = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.ID, "metadata"))
)

data = main.find_elements_by_id("metadata-line")

for datas in data:
    views = datas.find_element_by_xpath("""//*[@id="metadata-line"]/span[1]""")
    print(views.text)

Output of this:

104M views

But neither of them gave me what I wanted. Please help.

Future Goal (if you could help):

Upvotes: 2

Views: 676

Answers (1)

undetected Selenium

Reputation: 193078

To extract the text (e.g. 104M views) from each <span> using Selenium, you have to induce WebDriverWait for visibility_of_all_elements_located(). You can use either of the following Locator Strategies:

  • Using CSS_SELECTOR and get_attribute("innerHTML"):

    driver.get("https://www.youtube.com/results?search_query=Believer%20from%20Imagine%20Dragons")
    print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div#metadata-line span:first-child")))])
    
  • Using XPATH and text attribute:

    driver.get("https://www.youtube.com/results?search_query=Believer%20from%20Imagine%20Dragons")
    print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@id='metadata-line']/span[@class='style-scope ytd-video-meta-block' and contains(., 'views')]")))])
    
  • Console Output:

    ['1.5B views', '104M views', '32M views', '93M views', '98M views', '2.3M views', '39M views', '26M views', '1.4B views', '9.6M views', '6.7M views', '748K views', '1.3B views', '11M views', '84M views', '51M views', '13M views', '18M views', '197M views', '7.2M views', '79K views', '3.5M views']
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
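
Why both attempts in the question print the same value over and over: in Selenium, an XPath that starts with // is evaluated against the whole document even when it is called on an element, so every iteration returns the first match on the page. Prefixing the expression with a dot (for example ./span[1]) makes it relative to the current element. The sketch below illustrates the difference offline with Python's standard xml.etree.ElementTree rather than a browser. The markup in PAGE is a hypothetical, heavily simplified stand-in for the real results page, and the document-wide lookup only mimics what Selenium does with a leading //.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the search results markup (an assumption,
# not the real page source).
PAGE = """
<results>
  <div id="metadata"><div id="metadata-line"><span>1.5B views</span><span>upload date</span></div></div>
  <div id="metadata"><div id="metadata-line"><span>104M views</span><span>upload date</span></div></div>
  <div id="metadata"><div id="metadata-line"><span>32M views</span><span>upload date</span></div></div>
</results>
"""

root = ET.fromstring(PAGE)
lines = root.findall(".//div[@id='metadata-line']")

# Document-wide lookup repeated once per result: always the first match on the page.
# This mimics calling element.find_element_by_xpath("//*[@id='metadata-line']/span[1]").
document_wide = [root.find(".//div[@id='metadata-line']/span[1]").text for _ in lines]

# Lookup relative to each result: one distinct value per result.
per_result = [line.find("./span[1]").text for line in lines]

print(document_wide)  # ['1.5B views', '1.5B views', '1.5B views']
print(per_result)     # ['1.5B views', '104M views', '32M views']
```

In the question's loops, changing the XPath to "./span[1]" (relative to each metadata-line element) should have the same effect.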
    
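
If you prefer to keep a loop over each result, here is a minimal sketch under a few assumptions: it uses the find_elements(By..., ...) style of Selenium 4 (as far as I know, newer releases no longer ship the find_elements_by_* helpers used in the question), it assumes YouTube still uses the metadata-line markup shown above, and the helper names build_search_url and collect_view_counts are my own, not part of Selenium.

```python
from urllib.parse import urlencode


def build_search_url(query):
    """Return a YouTube search URL with the query URL-encoded."""
    return "https://www.youtube.com/results?" + urlencode({"search_query": query})


def collect_view_counts(driver, query, timeout=20):
    """Return the text of the first metadata <span> of every search result."""
    # Selenium is imported here so build_search_url() stays usable without it.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver.get(build_search_url(query))
    # Wait for the metadata lines of the results (assumed markup).
    lines = WebDriverWait(driver, timeout).until(
        EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div#metadata-line"))
    )
    counts = []
    for line in lines:
        # "./span[1]" is relative to this result; a leading "//" would search
        # the whole page and return the same element every time.
        spans = line.find_elements(By.XPATH, "./span[1]")
        if spans:
            counts.append(spans[0].text)
    return counts
```

Calling collect_view_counts(driver, "Believer from Imagine Dragons") with a started Chrome driver should return a list similar to the console output above. Note that results without a view count (live streams, for instance) may put other text in the first span, which is why the XPATH variant above filters on contains(., 'views').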


Upvotes: 1
