Reputation: 2748
I have a small internal webpage that requires a login. When logged in, a simple HTML page is loaded, and JavaScript scripts then load the actual content of the page.
I want to log in programmatically and retrieve the content that the JavaScript loads.
I found a package called requests_html that sounds like it is meant for exactly this. I managed to use requests_html to log in to the page and get the HTML view of the page I want. It should then be possible to call
response.html.render()
and requests_html should then use pyppeteer, which downloads and launches a headless Chromium, loads the webpage, renders it, and returns the result. This actually works, but it only returns the login page: the session information from requests_html is not passed on to pyppeteer/Chromium.
Is it possible to use the same session, or do I need to try to log in using only pyppeteer?
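For reference, the pyppeteer-only fallback would mean driving the login form from the headless browser itself, roughly along these lines (a sketch only; the URL, field names, and selectors are hypothetical):
import asyncio
from pyppeteer import launch

async def main():
    browser = await launch()
    page = await browser.newPage()
    await page.goto("https://example.com")  # hypothetical login URL
    # Fill in the login form (selectors are assumptions, adjust to the real page)
    await page.type('input[name="input_user"]', "user@example.com")
    await page.type('input[name="input_password"]', "hunter2")
    # Click submit and wait for the post-login navigation in parallel to avoid a race
    await asyncio.gather(
        page.waitForNavigation(),
        page.click('button[type="submit"]'),
    )
    # The page content is now the rendered, logged-in HTML
    print(await page.content())
    await browser.close()

asyncio.run(main())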
Here is a code example of my requests_html attempt, but you need a small webpage with a form login and JavaScript rendering to try it on:
from requests_html import HTMLSession
url = "https://example.com"
username = "[email protected]"
password = "hunter2"
session = HTMLSession()
payload = {
    "input_user": username,
    "input_password": password
}
response = session.post(url, data=payload)
# Logged in here
response = session.get(url)
response.html.render()
# Output from this shows login page
print(response.html.html)
Upvotes: 1
Views: 1904
Reputation: 155
Downloading the GitHub version (which I suppose is less stable) is not required. You can specify reload=False instead, so that Chromium renders the HTML your authenticated session already downloaded rather than re-requesting the URL without your cookies:
response.html.render(reload=False)
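In the context of the question's example, a minimal sketch of that change (the URL and credentials are placeholders):
from requests_html import HTMLSession

url = "https://example.com"  # placeholder URL
payload = {"input_user": "user@example.com", "input_password": "hunter2"}  # placeholder credentials

session = HTMLSession()
session.post(url, data=payload)      # log in; the auth cookies are stored on the session
response = session.get(url)          # fetch the page with the authenticated session
response.html.render(reload=False)   # render the already-downloaded HTML instead of re-fetching it
print(response.html.html)            # should now show the rendered, logged-in content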
Just saw that this is from 2019... I guess better late than never and yes that is what she said ;-)
Upvotes: 2
Reputation: 11
You can install the GitHub version of requests-html and use the following parameter to render():
response.html.render(send_cookies_session=True)
This maintains your session's login authorization in the Chromium page instance used for rendering.
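For comparison with the reload=False approach above, a minimal sketch of this flow. Only the install step and the render() call differ from the earlier sketch; as the answer notes, send_cookies_session is in the GitHub version rather than the PyPI release, and the install command and repository location below are assumptions:
# Development version from GitHub (repository location assumed):
#   pip install git+https://github.com/psf/requests-html
from requests_html import HTMLSession

session = HTMLSession()
# Log in with placeholder credentials; the auth cookies are stored on the session
session.post("https://example.com", data={"input_user": "user@example.com", "input_password": "hunter2"})
response = session.get("https://example.com")
# Copy the session's cookies into the Chromium page so it renders as a logged-in user
response.html.render(send_cookies_session=True)
print(response.html.html)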
Upvotes: 1