Volatil3

Reputation: 14978

Why am I unable to find html element with Python and Selenium?

I am having a weird issue with Python and Selenium. I am accessing the URL https://www.biggerpockets.com/users/JarridJ1. When you click "more", it shows further content. As far as I can tell, it is a React-based website. When I open the page in a browser and do a View Source, I can see the data I need in a React element: <div data-react-class="Profile/Header/Header" data-react-props="{&quot. I tried to automate Firefox via Selenium, but I could not get the data that way either. Check the screenshot:

[Screenshot]

Below is the code I tried:

from time import sleep

from selenium import webdriver
from selenium.webdriver.chrome.options import Options


def parse(u):
    print('Processing... {}'.format(u))
    driver.get(u)
    sleep(2)
    html = driver.page_source
    driver.save_screenshot('bp.png')
    print(html)


if __name__ == '__main__':
    options = Options()
    options.add_argument("--headless")  # Runs Chrome in headless mode.
    options.add_argument('--no-sandbox')  # Bypass OS security model
    options.add_argument('--disable-gpu')  # applicable to windows os only
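    options.add_argument('start-maximized')  # Start with the window maximized
    options.add_argument('disable-infobars')
    options.add_argument("--disable-extensions")
    # Note: these Chrome options are never passed to the Firefox driver below.
    driver = webdriver.Firefox()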
    parse('https://www.biggerpockets.com/users/JarridJ1')

Upvotes: 0

Views: 127

Answers (1)

Jortega

Reputation: 3790

This is a tricky one, but I found a way to get to the element you have highlighted. I am still not sure why driver.page_source is not returning what you are looking for.

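from selenium.webdriver.common.by import By


def parse(u):
    print('Processing... {}'.format(u))
    driver.get(u)
    sleep(2)
    # Newer Selenium 4 releases removed find_elements_by_xpath; use find_elements(By.XPATH, ...).
    get_everything = driver.find_elements(By.XPATH, "//*")
    for element in get_everything:
        print(element.get_attribute('innerHTML'))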

    #html = driver.page_source
    #driver.save_screenshot('bp.png')
    #print(html)
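
Printing the innerHTML of every element produces a lot of repeated output. If the attribute is present in the HTML string at all (as View Source suggests), a narrower option may be to scan that string for elements that carry data-react-class. Here is a minimal sketch using only the standard library. The names ReactPropsFinder and find_react_props and the sample markup are made up for illustration; in practice the string would come from driver.page_source, and I have not verified this against the live page.

```python
from html.parser import HTMLParser


class ReactPropsFinder(HTMLParser):
    """Collect data-react-props values keyed by data-react-class (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.props_by_class = {}

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        react_class = attr_map.get("data-react-class")
        if react_class is not None:
            # HTMLParser has already unescaped entities such as &quot; at this point.
            self.props_by_class[react_class] = attr_map.get("data-react-props")


def find_react_props(html):
    """Return a dict mapping each data-react-class to its raw props string."""
    finder = ReactPropsFinder()
    finder.feed(html)
    return finder.props_by_class


# Made-up sample markup standing in for driver.page_source.
sample_html = (
    '<div data-react-class="Profile/Header/Header" '
    'data-react-props="{&quot;name&quot;: &quot;example&quot;}"></div>'
)
print(find_react_props(sample_html))
```

If the attribute is missing from the result, that would suggest the markup Selenium sees differs from what View Source shows, which narrows down where to look next.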

Below is my standalone example:

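import time

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# Raw string so the backslashes in the Windows path are not treated as escapes.
driver = webdriver.Chrome(service=Service(r"C:\Path\To\chromedriver.exe"))
driver.get("https://www.biggerpockets.com/users/JarridJ1")
time.sleep(5)  # Give the page time to finish loading.
# Locate the React header element and read its props attribute.
header = driver.find_element(By.XPATH, "//div[@data-react-class='Profile/Header/Header']")
props = header.get_attribute("data-react-props")
print(props)
# Dump the innerHTML of every element on the page.
all_elements = driver.find_elements(By.XPATH, "//*")
for element in all_elements:
    print(element.get_attribute('innerHTML'))
driver.quit()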
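
The value returned by get_attribute("data-react-props") is one long string. Judging by the {&quot; prefix in the page source, it appears to be JSON, so it can probably be decoded into a dictionary rather than read as raw text. A minimal sketch under that assumption; decode_react_props is a hypothetical helper and the sample value is made up, since the real structure of the props is not shown here.

```python
import json


def decode_react_props(raw_props):
    """Turn a data-react-props attribute value into a Python object (hypothetical helper)."""
    if not raw_props:
        # get_attribute returns None when the attribute is absent.
        return None
    return json.loads(raw_props)


# Made-up value standing in for the string returned by get_attribute("data-react-props").
sample_props = '{"user": {"name": "example"}, "more": ["a", "b"]}'
decoded = decode_react_props(sample_props)
print(decoded["user"]["name"])
print(json.dumps(decoded, indent=2))
```

If the real attribute turns out not to be valid JSON, json.loads raises json.JSONDecodeError, which is worth catching in a real script.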

Upvotes: 1
