Reputation: 129
I want to extract some data from a JavaScript-rendered page using Selenium WebDriver in Python 3. I have tried several drivers, such as Firefox, ChromeDriver, and PhantomJS, but I always get the same result: instead of the rendered DOM elements, I only get the script.
Here is a snippet of my code:
from selenium import webdriver
url = 'https://www.google.com/flights/explore/#explore;f=BDO;t=r-Asia-0x88d9b427c383bc81%253A0xb947211a2643e5ac;li=0;lx=2;d=2018-01-09'
driver = webdriver.Chrome("/var/chromedriver/chromedriver")
driver.implicitly_wait(20)
driver.get(url)
print(driver.page_source)
Am I missing something here?
Upvotes: 2
Views: 2414
Reputation: 1
Use Helium, a Selenium wrapper:
# pip install helium
import helium, time
url_one = "https://www.vbiz.in/nseoptionchain.html"
browser_one = helium.start_chrome(url_one, headless=True)
seconds = 5
time.sleep(seconds)
html = browser_one.page_source
browser_one.quit()  # quit() ends the whole session; close() only closes the current window
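The fixed five-second sleep above can be too short or needlessly long. A minimal sketch of an alternative is to poll `page_source` until some text you expect in the rendered page shows up. The helper name `wait_for_page_source` and the `marker` argument are my own for illustration, not part of Helium or Selenium; the function only assumes the object it receives exposes a `page_source` attribute, as Selenium drivers do.

```python
import time


def wait_for_page_source(driver, marker, timeout=20.0, poll=0.5):
    """Return driver.page_source once `marker` appears in it.

    `driver` is any object with a `page_source` attribute (for example a
    Selenium WebDriver). `marker` is a string you expect only in the
    rendered page; choosing it is up to you.
    """
    deadline = time.monotonic() + timeout
    while True:
        html = driver.page_source
        if marker in html:
            return html
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{marker!r} not found within {timeout} seconds")
        time.sleep(poll)  # give the page's JavaScript time to run


# Possible usage with the browser started above (marker is a placeholder):
# html = wait_for_page_source(browser_one, "some text from the rendered table")
```

If the marker never appears, the helper raises `TimeoutError` instead of silently returning the unrendered source.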
Upvotes: 0
Reputation: 193088
I don't see any issue with your code block. I ran your own script as follows:
from selenium import webdriver
url = 'https://www.google.com/flights/explore/#explore;f=BDO;t=r-Asia-0x88d9b427c383bc81%253A0xb947211a2643e5ac;li=0;lx=2;d=2018-01-09'
driver = webdriver.Chrome()
driver.get(url)
print(driver.page_source)
I get the following console output:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en-US">
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
<meta name="deals::gwt:property" content="baseUrl=/flights/explore//static/" />
<title>Explore flights</title>
<meta name="description" content="Explore flights" />
<script src="https://apis.google.com/_/scs/abc-static/_/js/k=gapi.gapi.en.yoTdpQipo6s.O/m=gapi_iframes,googleapis_client,plusone/rt=j/sv=1/d=1/ed=1/am=AAE/rs=AHpOoo9_VhuRoUovwpPPf5LqLZd-dmCnxw/cb=gapi.loaded_0" async=""></script>
<script language="javascript" type="text/javascript">
var __JS_ILT__ = new Date();
.
.
.
</div></div><div aria-hidden="true" style="display: none;"><div class="CTPFVNB-l-j CTPFVNB-l-h">Displayed currencies may differ from the currencies used to purchase flights.– <a href="https://www.google.com/intl/en/googlefinance/disclaimer/" class="CTPFVNB-l-k">Disclaimer</a></div></div>
<div aria-hidden="true" style="display: none;"><div class="CTPFVNB-l-j CTPFVNB-l-h">Showing licensed rail data. – <a href="https://www.google.com/intl/en/help/legalnotices_maps.html" class="CTPFVNB-l-k">Legal Notice</a></div></div>
<div class="CTPFVNB-l-i"><a class="CTPFVNB-l-k CTPFVNB-l-j" href="https://www.google.com/intl/en/policies/">Privacy & Terms</a><a class="CTPFVNB-l-k CTPFVNB-l-j" href="https://support.google.com/flights/?hl=en">Help Center</a></div></div></div><iframe id="deals" tabindex="-1" style="position: absolute; width: 0px; height: 0px; border: none; left: -1000px; top: -1000px;">
</iframe><input type="text" id="_bgInput" style="display:none;" /></body></html>
Now, as you can see at the very end of the page_source, there is an iframe. Unless we switch to that iframe, you won't be able to find the DOM element you are looking for.
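A minimal sketch of that switch, assuming the frame of interest is the one with `id="deals"` from the output above. The helper name `page_source_inside_frame` is hypothetical; it relies only on Selenium's `driver.switch_to.frame(...)`, which accepts a frame id or name, and `driver.switch_to.default_content()`. I have not verified that this frame actually holds the data you want.

```python
def page_source_inside_frame(driver, frame_id="deals"):
    """Switch into the frame `frame_id`, return its source, then switch back.

    `driver` is expected to be a Selenium WebDriver. The default id
    "deals" is taken from the console output above.
    """
    driver.switch_to.frame(frame_id)  # move the driver's context into the iframe
    try:
        return driver.page_source  # now the frame's document, not the top page
    finally:
        driver.switch_to.default_content()  # always return to the top-level page


# Possible usage after driver.get(url):
# print(page_source_inside_frame(driver, "deals"))
```

Switching back in the `finally` block keeps later `find_element` calls working against the main document even if reading the frame fails.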
Upvotes: 1