Reputation: 153
So, I'm trying to use Selenium to build a dictionary of N YouTube videos and their view counts from a YouTube search, e.g. {'videourl01': 521, 'videourl02': 782}. The key will be the video's URL and the value will be its number of views, with N videos in total.
After landing on the search page and running the search, what should the next steps be to achieve this?
Any and all help is highly appreciated :>
So far I have managed to get all the video labels:
def GetTopVideosfromSearch(self,query,N):
    query = query.replace(' ', '+')
    self.browser.get('https://www.youtube.com/results?search_query='+query)
    vids=self.browser.find_elements_by_id('video-title')
    for vid in vids[0:N]:
        print((vid.get_attribute("aria-label")))
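A side note on the URL-building step: replacing spaces with `+` leaves characters such as `&` unescaped, so a query containing them would be cut short. Below is a minimal sketch using the standard library's `urllib.parse.quote_plus`; `build_search_url` is a hypothetical helper name and not part of the code above.

```python
from urllib.parse import quote_plus

SEARCH_URL = "https://www.youtube.com/results?search_query="

def build_search_url(query):
    """Return the search URL with the query safely encoded.

    quote_plus turns spaces into '+' and percent-encodes reserved
    characters such as '&'.
    """
    return SEARCH_URL + quote_plus(query)

print(build_search_url("lo-fi beats & rain"))
```

The result can be passed straight to `self.browser.get(...)` in place of the manual `replace(' ', '+')`.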
Upvotes: 0
Views: 138
Reputation: 153
I found a solution that works for me:
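# Requires: import time; from selenium.webdriver.common.keys import Keys
def GetTopVideosfromSearch(self,query,N):
    query = query.replace(' ', '+')
    self.browser.get('https://www.youtube.com/results?search_query='+query)
    # Scroll down so that more results get loaded (does nothing when N <= 4)
    for _ in range(N-4):
        self.browser.find_element_by_tag_name("body").send_keys(Keys.PAGE_DOWN)
        time.sleep(0.1)
    vids=self.browser.find_elements_by_id('video-title')
    vidsDict={}
    for vid in vids[0:N]:
        tmp = vid.get_attribute("aria-label")
        # Reverse the label so the trailing "<count> views" comes first
        tmp=tmp[::-1]
        s=0
        views=''
        for t in tmp:
            if t==' ':
                s+=1
            # After the first space we are inside the count; skip commas
            if s==1 and t!=' ' and t!=',':
                views+=t
        # The digits were collected back to front, so reverse them again
        views=int(views[::-1])
        vidsDict[vid.get_attribute("href")] = views
    return vidsDict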
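The character-by-character parsing above can also be written with a regular expression. This is only a sketch under the same assumption the loop makes, namely that the `aria-label` ends with something like `1,234,567 views`; `parse_views` is a hypothetical helper name, and it returns `None` instead of raising when the label does not end that way (for example "No views").

```python
import re

# Assumed label ending: "<digits with optional commas> view(s)"
_VIEWS_RE = re.compile(r"(\d[\d,]*)\s+views?\s*$")

def parse_views(label):
    """Return the view count at the end of an aria-label, or None."""
    if not label:
        return None
    match = _VIEWS_RE.search(label)
    if match is None:
        return None
    # Drop the thousands separators before converting
    return int(match.group(1).replace(",", ""))

example = "Some Title by Some Channel 2 years ago 3 minutes, 12 seconds 1,234,567 views"
print(parse_views(example))
```

Inside the loop this would replace the inner `for t in tmp` block with `views = parse_views(vid.get_attribute("aria-label"))`, skipping the video when the result is `None`.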
Upvotes: 0
Reputation:
I have used Selenium in the past, though to scrape other websites.
First of all, you need to make the page load its content, since YouTube is most likely using AJAX.
This can be achieved by sending the page-down key to the page:
Keys.PAGE_DOWN
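A fixed number of page-downs may load too few or too many results. One possible alternative is to keep scrolling until enough elements are present. The sketch below is deliberately browser-agnostic: `scroll_until`, `count_loaded` and `scroll_once` are hypothetical names, and the two callables are where the Selenium calls would go.

```python
import time

def scroll_until(count_loaded, scroll_once, target, max_rounds=50, pause=0.1):
    """Call scroll_once() until count_loaded() >= target.

    Gives up after max_rounds scrolls so a short result page cannot
    cause an endless loop. Returns the final element count.
    """
    for _ in range(max_rounds):
        if count_loaded() >= target:
            break
        scroll_once()
        time.sleep(pause)  # give the page a moment to load more results
    return count_loaded()

# With Selenium the callables could look like this (not executed here):
#   count_loaded = lambda: len(browser.find_elements_by_id("video-title"))
#   scroll_once  = lambda: body.send_keys(Keys.PAGE_DOWN)
```

The `max_rounds` cap is a design choice for searches that return fewer than the requested number of results.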
Once the content has loaded, search the resulting HTML for the element you are looking for.
In my case I was looking for the price:
browser.find_elements_by_class_name("product-info-price")
Once you have the elements, you can loop over them and add the results to the dictionary:
Here is a complete snippet:
# imports (only the ones this snippet actually uses)
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
link = "https://es.wallapop.com/search?catIds=12461&dist=400&publishDate=any"
browser = webdriver.Chrome()
browser.get(link)
time.sleep(1)
body = browser.find_element_by_tag_name("body")
element = browser.find_element_by_class_name('Button')
browser.execute_script("arguments[0].click();", element)
# generate content, scrolling down the webpage
for _ in range(10):
    body.send_keys(Keys.PAGE_DOWN)
    time.sleep(0.1)
# iterate over the elements and append to the list
list_of_prices = []
for price in browser.find_elements_by_class_name("product-info-price"):
    list_of_prices.append(price.text)
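The loop above collects a list. For the dictionary the question asks for, the same iteration can key each value by another attribute of the element. This is a rough sketch: `elements_to_dict` is a hypothetical helper, and `StubElement` is only a stand-in so the example runs without a browser. Real Selenium elements expose the same `get_attribute` method, which returns `None` when the attribute is missing.

```python
def elements_to_dict(elements, key_attr, value_attr, limit):
    """Map key_attr to value_attr for the first `limit` usable elements."""
    result = {}
    for el in elements:
        if len(result) >= limit:
            break
        key = el.get_attribute(key_attr)
        value = el.get_attribute(value_attr)
        if key is None or value is None:
            continue  # skip rows that lack either attribute
        result[key] = value
    return result

class StubElement:
    """Stand-in for a Selenium WebElement, for demonstration only."""
    def __init__(self, **attrs):
        self._attrs = attrs

    def get_attribute(self, name):
        return self._attrs.get(name)

rows = [
    StubElement(href="videourl01", **{"aria-label": "A 521 views"}),
    StubElement(**{"aria-label": "row without a link"}),
    StubElement(href="videourl02", **{"aria-label": "B 782 views"}),
]
labels_by_url = elements_to_dict(rows, "href", "aria-label", limit=2)
print(labels_by_url)
```

Counting usable rows rather than slicing the first N elements means a skipped row does not reduce the number of entries returned.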
Upvotes: 1