Reputation: 41
We're trying to fetch an HTML page and read its content using Python. The page is built with frames, and the frame content is not returned with the request. The code is:
import cookielib
import urllib
import urllib2

URL = "http://192.168.1.48/_pnt_log.html"
username = "11111"
password = "1"
# Keep cookies between requests so the login session is reused
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
login_data = urllib.urlencode({'username': username, 'j_password': password})
try:
    opener.open('http://192.168.1.48/_top.html', login_data)
    resp = opener.open('http://192.168.1.48/_dept.html?dn=1')
except urllib2.URLError as e:
    print("Request failed: %s" % e)
The HTML received is the following:
<html>
<head>
<meta http-equiv="content-type" content="text/html;charset=iso-8859-1">
<title>Remote UI<Additional Functions>: : imageRUNNER2520</title>
</head>
<frameset cols="175,*" bordercolor="white" border="0" framespacing="0" frameborder="0">
<frame src="index06_02.html" name="Menu" scrolling="AUTO" noresize>
<frame src="dept.html?dn=1" name="body" noresize>
<noframes>
<body bgcolor="white">
</body>
</noframes>
</frameset>
</html>
I want the content of dept.html?dn=1, which is not loaded with this request. Is there any way to get the content the way a browser does?
Upvotes: 0
Views: 1154
Reputation: 41
In the end, the "problem" was in how the Canon printer page keeps its cookies and how sessions are opened and closed with urllib2.
I solved it using the Selenium Python library: http://selenium-python.readthedocs.io/
With Selenium I take the HTML from the browser and sidestep the permissions problem, because I work in the same session through the browser.
from selenium import webdriver
##OPEN BROWSER##
driver = webdriver.Firefox()
##LOGIN##
driver.get("http://192.168.1.48/_top.html")
driver.find_element_by_name('user_name').send_keys("11111")
driver.find_element_by_name('pwd').send_keys("1")
driver.find_element_by_xpath("/html/body/form/center/p[1]/table/tbody/tr[2]/td/table/tbody/tr[3]/td/table/tbody/tr/td/table/tbody/tr[13]/td[3]/a/img").click()
driver.get("http://192.168.1.48/dept.html?dn=1")
##GET HTML##
elem = driver.find_element_by_xpath("//*")
source_code = elem.get_attribute("outerHTML")
##SAVE HTML##
f = open('/home/itsoum/PrinterProject/html_source_code.html', 'w')
f.write(source_code.encode('utf-8'))
f.close()
driver.quit()
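For reference, the frame address used above (dept.html?dn=1) can also be derived from the frameset HTML instead of being typed by hand. The following is only a minimal sketch, assuming the frames are plain <frame> or <iframe> tags with a src attribute. The names FrameCollector and frame_urls are hypothetical helpers made up for this illustration, and the import fallback is there so it runs on both Python 2 and Python 3.

```python
try:  # Python 3
    from html.parser import HTMLParser
    from urllib.parse import urljoin
except ImportError:  # Python 2, as used in the question
    from HTMLParser import HTMLParser
    from urlparse import urljoin


class FrameCollector(HTMLParser):
    """Hypothetical helper: collects the src of every <frame>/<iframe> tag."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lower-cased; attrs is a list of (name, value) pairs
        if tag in ("frame", "iframe"):
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def frame_urls(page_url, html):
    """Return absolute URLs for all frames found in html (hypothetical helper)."""
    collector = FrameCollector()
    collector.feed(html)
    collector.close()
    # Frame src values are relative to the page that contains the frameset
    return [urljoin(page_url, src) for src in collector.sources]


# Frameset taken from the HTML shown in the question
frameset = (
    '<frameset cols="175,*">'
    '<frame src="index06_02.html" name="Menu" scrolling="AUTO" noresize>'
    '<frame src="dept.html?dn=1" name="body" noresize>'
    '</frameset>'
)
print(frame_urls("http://192.168.1.48/_dept.html?dn=1", frameset))
```

For the frameset from the question this yields http://192.168.1.48/index06_02.html and http://192.168.1.48/dept.html?dn=1. Each URL still has to be requested within the logged-in session, so on this printer it may only work through the browser as described above.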
Upvotes: 1