SamuraiBlue

Reputation: 861

Python & Selenium: How to get values generated by JavaScript

I use Selenium in Python for scraping. I can't get some values even though they are displayed in the browser.

So I checked the HTML source code and found that the values are not in the HTML, as shown below.

HTML

<div id="pos-list-body" class="list-body">

</div>

But the values are there when I inspect the page with Chrome's developer tools.

DevTools

<div id="pos-list-body" class="list-body">
    <div class="list-body-row" id="pos-row-1">
        <div class="pos-list-col-1">
            <input class="list-checkbox" type="checkbox" value="1">
        </div>
        <div class="detail-data pos-list-col-2">
            1
        </div>
        <div class="detail-data pos-list-col-3">
            a
        </div>
        ...
    </div>
    <div class="list-body-row" id="pos-row-2">
        <div class="pos-list-col-1">
            <input class="list-checkbox" type="checkbox" value="2">
        </div>
        <div class="detail-data pos-list-col-2">
            2
        </div>
        <div class="detail-data pos-list-col-3">
            b
        </div>
        ...
    </div>
    ...
</div>

It seems that these values are generated by JavaScript or something similar.

There is no iframe in the source code.

How can I get these values with Python?

It would be appreciated if you could give me a hint.

Upvotes: 1

Views: 2266

Answers (3)

Asmoun

Reputation: 1747

Since the other answers do not seem to solve your issue, one possibility remains: more than one HTML element in the DOM has the ID pos-list-body, and the first element retrieved by this ID is empty and is not the one you are after. Solution: select the <div> with XPath instead of by ID, or collect all the elements with this ID in a list and print the innerHTML of each one to find the index of your target element.
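The second suggestion could be sketched roughly as follows. This is only an illustration: it assumes `driver` is an existing Selenium WebDriver with the page already loaded, and the helper names `inner_html_by_id` and `first_non_empty` are made up for this example. It uses an attribute selector so that every element carrying the ID is returned, not just the first.

```python
# Assumption: `driver` is an already-created Selenium WebDriver with the page loaded.
# The JavaScript collects every element whose id attribute equals the given value.
JS_LIST_BY_ID = """
const id = arguments[0];
return Array.from(document.querySelectorAll('[id="' + id + '"]'))
            .map(el => el.innerHTML);
"""


def inner_html_by_id(driver, element_id):
    """Return the innerHTML of every element sharing `element_id`, in document order."""
    return driver.execute_script(JS_LIST_BY_ID, element_id)


def first_non_empty(html_list):
    """Return (index, html) of the first entry with non-whitespace content, or None."""
    for index, html in enumerate(html_list):
        if html.strip():
            return index, html
    return None
```

Printing each entry of `inner_html_by_id(driver, "pos-list-body")` shows which of the duplicates, if any, actually holds the rows.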

Upvotes: 0

undetected Selenium

Reputation: 193338

Element.outerHTML

The outerHTML property of an Element returns a string containing the serialized HTML of the element, including its descendants. Setting outerHTML replaces the element and all of its descendants with a new DOM tree built by parsing the given string. To obtain only the HTML of an element's contents, without the element itself, use the innerHTML property instead.


Solution

To get the HTML generated by JavaScript, you can use the following:

print(driver.execute_script("return document.getElementById('pos-list-body').outerHTML"))
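Once the outerHTML string is available, the individual values still have to be pulled out of it. A minimal sketch using only the standard library is shown below; it assumes the markup looks like the DevTools snippet in the question (rows are `div.list-body-row`, values are `div.detail-data` containing plain text), and the names `DetailDataParser` and `extract_rows` are hypothetical.

```python
from html.parser import HTMLParser


class DetailDataParser(HTMLParser):
    """Collect the text of each <div class="detail-data ...">, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # one list of cell strings per list-body-row
        self._in_cell = False   # True while inside a detail-data div

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        if "list-body-row" in classes:
            self.rows.append([])
        elif "detail-data" in classes and self.rows:
            self.rows[-1].append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        # Assumes detail-data divs hold only text, so any </div> closes the cell.
        if tag == "div":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def extract_rows(outer_html):
    """Return a list of rows, each a list of stripped cell values."""
    parser = DetailDataParser()
    parser.feed(outer_html)
    parser.close()
    return [[cell.strip() for cell in row] for row in parser.rows]
```

Passing the string returned by the `execute_script` call to `extract_rows` would then yield one list of values per row.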

Upvotes: 0

cruisepandey

Reputation: 29382

If the ID pos-list-body is unique in the DOM, your best bet is to use an explicit wait and then read innerText:

Code:

wait = WebDriverWait(driver, 20)
print(wait.until(EC.presence_of_element_located((By.ID, "pos-list-body"))).get_attribute('innerText'))

Imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
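One caveat worth considering: according to the question, the empty pos-list-body container is already in the static HTML, so a presence check on the container may succeed before JavaScript has filled it. A possible variation is to wait for the child rows instead. The sketch below is an assumption-laden illustration: `rows_loaded` and `ROW_SELECTOR` are made-up names, and the selector relies on the `list-body-row` class seen in the DevTools snippet.

```python
# Rows rendered inside the container, per the DevTools snippet in the question.
ROW_SELECTOR = "#pos-list-body .list-body-row"


def rows_loaded(driver):
    """Wait condition: return the text of each rendered row, or False while none exist."""
    # "css selector" is the string value of Selenium's By.CSS_SELECTOR.
    rows = driver.find_elements("css selector", ROW_SELECTOR)
    return [row.text for row in rows] or False


# Usage with a WebDriverWait instance named `wait`:
# row_texts = wait.until(rows_loaded)
```

`WebDriverWait.until` accepts any callable that takes the driver and keeps polling until it returns a truthy value, so the function can be passed in directly.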

Upvotes: 1
