Abhinav

Reputation: 3

Unable to scrape the src attribute of a source tag nested inside a video element with Python Selenium and Beautiful Soup

I was scraping an anime website as a project, but when I tried to scrape the video's src it gave me an error. The src attribute is on a source tag nested inside the video tag. The screenshot and code are below.

example screenshot

Code:

    from selenium import webdriver
    from selenium.webdriver.common.keys import Keys
    from bs4 import BeautifulSoup
    import re
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    # launch url
    url = "https://bestdubbedanime.com/Demon-Slayer-Kimetsu-no-Yaiba/26"

    # create a new Firefox session
    driver = webdriver.Firefox()
    # driver.implicitly_wait(30)
    driver.get(url)

    # python_button = driver.find_element_by_class_name('playostki') #FHSU
    # python_button.click() #click fhsu link

    # the page source is parsed here, before the player has been clicked
    soup1 = BeautifulSoup(driver.page_source, 'html.parser')

    video = soup1.find('video', id='my_video_1_html5_api')
    # video = driver.find_element_by_id('my_video_1_html5_api')
    WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".playostki"))).click()
    # quit() has to be called with parentheses; it closes the browser and ends the session
    driver.quit()

Upvotes: 0

Views: 88

Answers (1)

QualityMatters

Reputation: 911

You are not getting the src attribute because the source tag is only added to the page after the video is clicked. Click the video first, and then read the "src" attribute from the source element.

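    # Reuses the imports from the question (webdriver, By, WebDriverWait, EC).
    driver = webdriver.Firefox()
    driver.maximize_window()
    driver.get("https://bestdubbedanime.com/Demon-Slayer-Kimetsu-no-Yaiba/26")
    # Click the player overlay so the page adds the <source> element.
    WebDriverWait(driver, 60).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='playostki']//img"))).click()
    # Wait for the <source> child of the video element, then read its src.
    print(WebDriverWait(driver, 60).until(EC.presence_of_element_located((By.CSS_SELECTOR, "#my_video_1_html5_api > source"))).get_attribute("src"))
    driver.quit()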

Output:

https://bestdubbedanime.com/xz/api/v.php?u=eVcxb0ZCUEMraFd1Vi9pM2xqWUhtbXZMWjZ0Mlpoc1U0Tmhqc2VFcVViQUc3VUVhR0pZV1EvaW1nY1duaXBMeXYvUUY4RG5ab3p4MEtEMUFHRmVaN0taVG9sY3ZVcTRoeDZoVHhWLzdiYjQ5UStNN2FYSjJBSWNKL0t5S1hLNGEyVlZqV1BYQ2MwaCsyNWcvak1Db01EMnNtWGwwTTBBVld4MkNER0V3eGNCRXJ0cEY4RHFPclhwbTJpWFBPSmJI
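
If you would rather keep Beautiful Soup for the extraction, the same idea applies: parse the page source only after the click, once the source tag exists. The following is a minimal sketch rather than a tested solution for this site. The helper name `extract_video_src` and the two HTML strings are hypothetical stand-ins for `driver.page_source` before and after the click; only the video id `my_video_1_html5_api` comes from the question.

```python
from typing import Optional

from bs4 import BeautifulSoup


def extract_video_src(html: str, video_id: str = "my_video_1_html5_api") -> Optional[str]:
    """Return the src of the first <source> inside the given <video>, or None."""
    soup = BeautifulSoup(html, "html.parser")
    video = soup.find("video", id=video_id)
    if video is None:
        # The video element is not in the markup (for example, before the click).
        return None
    source = video.find("source")
    if source is None:
        return None
    return source.get("src")


# Hypothetical markup standing in for driver.page_source before and after the click.
before_click = '<div class="playostki"><img src="play.png"></div>'
after_click = (
    '<video id="my_video_1_html5_api">'
    '<source src="https://example.com/video.mp4" type="video/mp4">'
    "</video>"
)

print(extract_video_src(before_click))  # None
print(extract_video_src(after_click))   # https://example.com/video.mp4
```

In the Selenium flow, you would call `extract_video_src(driver.page_source)` after the wait for `#my_video_1_html5_api > source` has succeeded and before `driver.quit()`.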

Upvotes: 1
