LdM

Reputation: 704

Creating a dataframe with data extracted from a website

I am trying to get the number of shares from the Facebook section of sharedcount.com using Selenium, as follows:

from selenium import webdriver
import time

chrome_options = webdriver.ChromeOptions()

total = []
shares = []
comments = []
reactions = []
    
query=["www.stackoverflow.com","www.facebook.com"]
driver = webdriver.Chrome(mypath,chrome_options=chrome_options) 
driver.get('https://www.sharedcount.com/')
textarea = driver.find_element_by_xpath('//textarea')
for x in query:
     textarea.send_keys(query)
     button = driver.find_element_by_xpath("//button[@class='button button_accent-green']")
     button.click()
     body = driver.find_element_by_xpath("//tb-data tb-data_face[@class='bold-text']")
     total.append(body)
     sharing=driver.find_element_by_xpath("//tb-line-info__text tb-line-info__text_left tb-line-info__face[@class='bold-text']")
     shares.append(sharing)
     com=driver.find_element_by_xpath("//tb-line-info__text tb-line-info__text_left[@class='bold-text']")
     comments.append(com)

I then want to create a new dataframe with the information for each URL in the query list. However, I'm getting this error:

InvalidSelectorException: Message: invalid selector: Unable to locate an element with the xpath expression //tb-data tb-data_face[@class='bold-text'] because of the following error:
SyntaxError: Failed to execute 'evaluate' on 'Document': The string '//tb-data tb-data_face[@class='bold-text']' is not a valid XPath expression.
  (Session info: chrome=91.0.4472.114)

Does anyone know how to fix this error and produce the following output?

url                      shares      comments    reactions
www.stackoverflow.com    11.2k       3.8k        9.1k
www.facebook.com         1920.4m     64.9m       517.7m

After using WebDriverWait I get this other error:

     24 button.click()
---> 25 WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, '//td[@class="tb-data tb-data-face"]/div[@class="bold-text"]')))
     26 body = driver.find_element_by_xpath('//td[@class="tb-data tb-data-face"]/div[@class="bold-text"]')
     27 total.append(body)

~/opt/anaconda3/lib/python3.8/site-packages/selenium/webdriver/support/wait.py in until(self, method, message)
     78             if time.time() > end_time:
     79                 break
---> 80         raise TimeoutException(message, screen, stacktrace)
     81 
     82     def until_not(self, method, message=''):

TimeoutException: Message: 

Upvotes: 1

Views: 82

Answers (1)

montovaneli

Reputation: 81

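I believe you are missing the <td> and <div> tags in the XPath. Each XPath step has to start with an element name, and the class names belong inside the [@class="..."] predicate. It should be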

'//td[@class="tb-data tb-data_face"]/div[@class="bold-text"]'

You also need to adjust the sharing and com XPaths using the same idea.
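To illustrate the shape of the corrected expression, here is a minimal sketch that evaluates the same XPath against a small, made-up fragment using Python's standard xml.etree.ElementTree instead of Selenium. The markup is an assumption inferred only from the XPath above, and the value 11.2k is taken from the expected output in the question; the real page may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, inferred only from the XPath in this answer.
# The real page structure may differ.
SAMPLE = """
<table>
  <tr>
    <td class="tb-data tb-data_face">
      <div class="bold-text">11.2k</div>
    </td>
  </tr>
</table>
""".strip()

root = ET.fromstring(SAMPLE)

# Element names (td, div) come first; the class value goes in the predicate.
XPATH = ".//td[@class='tb-data tb-data_face']/div[@class='bold-text']"

# Collect the text of every matching <div>.
values = [el.text for el in root.findall(XPATH)]
print(values)
```

The @class predicate compares the whole attribute string, so the value has to match the page's class attribute exactly, including the order of the class names.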

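Edit 1: Try waiting for the element to become visible. Note that the traceback in the question uses tb-data-face (with a hyphen), while the XPath above uses tb-data_face (with an underscore); the class value has to match the page exactly, otherwise the wait will time out.

Import:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

Then, inside the loop:

for x in query:
     ...
     button.click()
     
     # Wait up to 10 seconds for the elements to be visible
     WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, '//td[@class="tb-data tb-data_face"]/div[@class="bold-text"]')))

     body = driver.find_element_by_xpath('//td[@class="tb-data tb-data_face"]/div[@class="bold-text"]')
     # Store the displayed text rather than the WebElement itself
     total.append(body.text)
     ...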
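Once the loop has collected the text of each element, the lists can be combined into the table from the question. The following is only a sketch: build_rows is a hypothetical helper, the values are copied from the expected output in the question rather than scraped, and pandas is treated as optional so the example still runs without it.

```python
# Values copied from the expected output in the question; in the real
# loop each one would come from element.text after the wait succeeds.
query = ["www.stackoverflow.com", "www.facebook.com"]
shares = ["11.2k", "1920.4m"]
comments = ["3.8k", "64.9m"]
reactions = ["9.1k", "517.7m"]


def build_rows(urls, shares, comments, reactions):
    """Pair each URL with its scraped strings, one dict per row."""
    return [
        {"url": u, "shares": s, "comments": c, "reactions": r}
        for u, s, c, r in zip(urls, shares, comments, reactions)
    ]


rows = build_rows(query, shares, comments, reactions)

try:
    import pandas as pd  # only needed for the DataFrame step

    df = pd.DataFrame(rows, columns=["url", "shares", "comments", "reactions"])
    print(df)
except ImportError:
    # Fall back to the plain list of dicts if pandas is not installed.
    print(rows)
```

Note that zip stops at the shortest list, so if one lookup fails for a URL the rows will silently shift; appending a placeholder such as None in that case keeps the columns aligned.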

Upvotes: 1
