Sarla Devi

Reputation: 49

Web scraping with Selenium returns an empty list

I have done a little web scraping before, but I have no knowledge of JavaScript. I want to scrape the "Company Name" and "description of the company" fields from https://www.ces.tech/Show-Floor/Exhibitor-Directory.aspx. I am using Selenium for the scraping, but I don't want a visible browser window, so I run Chrome headless. Here is the code I wrote:

from selenium.webdriver.common.by import By
from selenium import webdriver
import os
op = webdriver.ChromeOptions()
op.add_argument('headless')
driver = webdriver.Chrome(options=op)
driver.get('https://www.ces.tech/Show-Floor/Exhibitor-Directory.aspx')
company = []
items = driver.find_elements(By.CLASS_NAME, "exhibitorCardModal")
for item in items:
    comp=item.find_elements(By.CLASS_NAME, "company-name")
    desc = item.find_elements(By.CLASS_NAME, "description")
    result_dict = {
        "company":comp.text,
        "description":desc.text
    }
    company.append(result_dict)

But I get an empty list. Can someone tell me what is wrong here? I also tried to use their API at https://www.ces.tech/api/Exhibitors?searchTerm=&sortBy=alpha&alpha=&state=&country=&venue=&exhibitorType=&pageNo=1&pageSize=30 but got this error:

{"error":{"code":"ApiVersionUnspecified","message":"An API version is required, but was not specified."}}

Upvotes: 0

Views: 480

Answers (1)

Prophet

Reputation: 33371

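  1. You have to add a wait / delay before accessing the elements, so that the page has finished loading them before you try to access them. Without it, find_elements runs before the cards are rendered and returns an empty list.
  2. Inside the loop you should use find_element instead of find_elements. find_elements returns a list, and a list has no .text attribute, so these two lines need to change: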
comp=item.find_elements(By.CLASS_NAME, "company-name")
desc = item.find_elements(By.CLASS_NAME, "description")

So your code should be something like this:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium import webdriver
import os
import time

op = webdriver.ChromeOptions()
op.add_argument('headless')
driver = webdriver.Chrome(options=op)
wait = WebDriverWait(driver, 20)

driver.get('https://www.ces.tech/Show-Floor/Exhibitor-Directory.aspx')

wait.until(EC.presence_of_element_located((By.CLASS_NAME, "exhibitorCardModal")))
time.sleep(0.5)
company = []
items = driver.find_elements(By.CLASS_NAME, "exhibitorCardModal")
for item in items:
    comp=item.find_element(By.CLASS_NAME, "company-name")
    desc = item.find_element(By.CLASS_NAME, "description")
    result_dict = {
        "company":comp.text,
        "description":desc.text
    }
    company.append(result_dict)
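
If some cards might not contain both fields, find_element will raise NoSuchElementException and abort the loop. Below is a slightly more defensive variant, only as a sketch: the helper names first_text, cards_to_dicts and scrape are made up for illustration, the class names are taken from the question and not verified against the live page, and it assumes the cards are rendered in the main document rather than inside an iframe. It waits for the cards with presence_of_all_elements_located and uses find_elements for the children, which returns an empty list instead of raising.

```python
def first_text(parent, by, value):
    """Return the stripped text of the first matching child, or "" if none."""
    matches = parent.find_elements(by, value)  # empty list when nothing matches
    return matches[0].text.strip() if matches else ""


def cards_to_dicts(cards, by="class name"):
    """Build one dict per card; "class name" is the value of By.CLASS_NAME."""
    return [
        {
            "company": first_text(card, by, "company-name"),
            "description": first_text(card, by, "description"),
        }
        for card in cards
    ]


def scrape(url, timeout=20):
    """Load url in headless Chrome and return the list of card dicts."""
    # Imported here so the helpers above can be used without a browser.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Raises TimeoutException if no card appears within `timeout` seconds.
        cards = WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located(
                (By.CLASS_NAME, "exhibitorCardModal")
            )
        )
        return cards_to_dicts(cards, By.CLASS_NAME)
    finally:
        driver.quit()  # always close the browser, even on errors
```

You would call it as company = scrape('https://www.ces.tech/Show-Floor/Exhibitor-Directory.aspx'). Cards that lack a field get an empty string for it instead of stopping the whole run.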

Upvotes: 1
