Reputation: 129
I want to parse a list of prices from a web page using Selenium WebDriver in Python. So I tried to fetch all the DOM elements using this code:
from selenium import webdriver

url = 'https://www.google.com/flights/explore/#explore;f=BDO;t=r-Asia-0x88d9b427c383bc81%253A0xb947211a2643e5ac;li=0;lx=2;d=2018-01-09'
driver = webdriver.Chrome()
driver.get(url)
print(driver.page_source)
The problem is that what I get from page_source is different from what I see when I inspect the element:
<div class="CTPFVNB-f-a">
<div class="CTPFVNB-f-c"></div>
<div class="CTPFVNB-f-d" elt="toolbelt"></div>
<div class="CTPFVNB-f-e" elt="result">Here is the difference</div>
</div>
The difference is inside the div with the CTPFVNB-f-e class. In the inspected DOM, this tag holds all the prices that I want to fetch, but in the result of page_source this part is missing.
Could anyone tell me what is wrong with my code? Or do I need further steps to parse the list of prices?
Upvotes: 1
Views: 455
Reputation: 7238
JavaScript is modifying the page after the page loads. As you are printing page source immediately after opening the page, you're getting the initial code without the execution of JavaScript.
You can do any one of the following things:
time.sleep(x), where x is the delay in seconds; change it according to your requirements (NOT recommended)
driver.implicitly_wait(x), where x is again a timeout in seconds
Using an explicit wait is the better option here, as it waits only for the time required for the element to become visible, so it won't cause any excess delay. With an implicit wait, if the page loads slower than expected, you won't get the desired output.
Upvotes: 2