Reputation: 33
I am a newbie and would like to know how to scrape YouTube comments using BeautifulSoup. I am stuck at this point. Can anyone help me with the code?
Here is what I have written:
import requests
from bs4 import BeautifulSoup
r = requests.get("https://www.youtube.com/watch?v=kffacxfA7G4")
req = r.content
soup = BeautifulSoup(req,'html.parser')
print(soup.prettify())
all = soup.find_all('div',{'id' : 'contents'})
I got stuck here and am not getting any output. When I inspected the web page, it showed that the comments sit inside an element with id = contents.
Upvotes: 3
Views: 5723
Reputation: 22440
The comments on that site are generated dynamically, so you can't get them from the main link using requests and the BeautifulSoup library. To get content that is loaded this way, you need a browser automation tool such as selenium. As a starting point, you can try the script below, which fetches the comments. Btw, the site also lazy-loads its comments, so you need to tweak the for loop to get more content.
import time
from selenium.webdriver import Chrome
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
with Chrome() as driver:
    wait = WebDriverWait(driver, 10)
    driver.get("https://www.youtube.com/watch?v=kffacxfA7G4")

    for item in range(3):  # by increasing the highest range you can get more content
        wait.until(EC.visibility_of_element_located((By.TAG_NAME, "body"))).send_keys(Keys.END)
        time.sleep(3)

    for comment in wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "#comment #content-text"))):
        print(comment.text)
Partial output:
15 April 2018 ?¿?
April 2018??
8 years people 👌
Nice songs Justin Bieber https://youtu.be/OvfAc7JGoc4
2018 hit like...♥️♥️♥️♥️😁👌🏻
8 years complete 🙏
Can likes beat dislikes??
View 1, 8 billion great song
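If guessing a fixed number of scrolls is inconvenient, one possible variation on the loop in the script above is to keep scrolling until the number of loaded comments stops growing. The sketch below is only an illustration of that idea: `scroll_until_stable`, `scroll_once`, `count_items`, `max_rounds` and `pause` are hypothetical names that are not part of selenium, and the helper takes plain callables so it can be tried without a browser.

```python
import time
from typing import Callable


def scroll_until_stable(
    scroll_once: Callable[[], None],
    count_items: Callable[[], int],
    max_rounds: int = 20,
    pause: float = 3.0,
) -> int:
    """Scroll until the item count stops growing or max_rounds is reached.

    Returns the last item count that was observed.
    """
    seen = count_items()
    for _ in range(max_rounds):
        scroll_once()
        time.sleep(pause)  # give lazily loaded content time to arrive
        current = count_items()
        if current == seen:  # nothing new appeared, so stop early
            break
        seen = current
    return seen


# Hypothetical wiring with the selenium objects from the script above
# (assumes `driver`, `By` and `Keys` are already set up as in that script):
#
#   body = driver.find_element(By.TAG_NAME, "body")
#   total = scroll_until_stable(
#       scroll_once=lambda: body.send_keys(Keys.END),
#       count_items=lambda: len(
#           driver.find_elements(By.CSS_SELECTOR, "#comment #content-text")
#       ),
#   )
```

Stopping after a single round without growth is a simplification. On a slow connection the next batch may simply not have arrived yet, so a longer `pause` or a few retries may be needed.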
Upvotes: 6