Headdonkeyy

Reputation: 1

Using find_all in BeautifulSoup to grab Class

I'm trying to grab information from the NYSE, specifically the class "flex_tr", from https://www.nyse.com/quote/XNGS:AAPL. The HTML path is:

html->body->div->div.sticky-header__main->div.landing-section->div.idc-container->div->div->div.row->div.col-lg-12.col-md-12->div.d-widget.d-vbox.d-flex1.DataTable-nyse->div.d-container.d-flex1.d-vbox.d-nowrap.d-justify-start.data-table-container.d-noscroll->div.d-flex1->div.d-vbox->div.d-flex-1.d-scroll-y->div.contentContainer->div.flex_tr

There are a ton of rows this should be grabbing, but currently I'm unable to get the contents of any of them. I've tried both soup.find_all("div", class_="flex_tr") and soup.find_all("div", {"class": "flex_tr"}), and neither returns the information.

from selenium import webdriver
from bs4 import BeautifulSoup

# Use a raw string so the backslashes in the Windows path aren't treated as escape sequences
driver = webdriver.Chrome(r"C:\Program Files (x86)\Google\Chrome\Application\chromedriver.exe")

driver.get('https://www.nyse.com/quote/XNGS:AAPL')

content = driver.page_source
soup = BeautifulSoup(content, 'html.parser')

flex_tr = soup.find_all(class_="flex_tr")
print(flex_tr)

driver.close()

Upvotes: 0

Views: 440

Answers (1)

kbeloin

Reputation: 199

It looks like you're grabbing the page source before the element has loaded (that's why the return value is an empty list): the rows are rendered by JavaScript after the initial page load.

Selenium includes modules that let you wait for an element to load. This question covers it in more detail: Wait until page is loaded with Selenium WebDriver for Python

As for your question, I was able to get this working with the following (it mirrors the top answer in the linked question):

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
from bs4 import BeautifulSoup


driver = webdriver.Chrome('./chromedriver')

driver.get('https://www.nyse.com/quote/XNGS:AAPL')

delay = 5  # seconds to wait before timing out

try:
    # Block until at least one element with class "flex_tr" is present in the DOM
    WebDriverWait(driver, delay).until(
        expected_conditions.presence_of_element_located((By.CLASS_NAME, "flex_tr"))
    )
    content = driver.page_source
    soup = BeautifulSoup(content, 'html.parser')
    flex_tr = soup.find_all(class_="flex_tr")
    print(flex_tr)
except TimeoutException:
    print("Timeout")

driver.close()
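Once find_all returns the rows, you'll probably want the text out of them rather than the raw tags. Here's a minimal sketch of that step using a hand-written HTML snippet that mimics the row structure (the snippet, its tickers, and the inner div cells are assumptions for illustration; the real page's cell markup may differ):

```python
from bs4 import BeautifulSoup

# Hypothetical snippet standing in for driver.page_source;
# the real NYSE table's inner markup may differ.
html = """
<div class="contentContainer">
  <div class="flex_tr"><div>AAPL</div><div>150.00</div></div>
  <div class="flex_tr"><div>MSFT</div><div>300.00</div></div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# For each matched row, collect the stripped text of its child divs
rows = [[cell.get_text(strip=True) for cell in row.find_all("div")]
        for row in soup.find_all(class_="flex_tr")]
print(rows)  # [['AAPL', '150.00'], ['MSFT', '300.00']]
```

With the real page you'd feed driver.page_source into BeautifulSoup instead of the literal string.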

Upvotes: 1
