Reputation: 101
Long story short: I switched from Selenium to Requests(-html).
It works OK, but not in every case.
Page: https://www.winamax.fr/paris-sportifs/sports/1/1/1
On load, the page fetches dynamic content with English games (for example: Sheffield United - West Ham).
But when I try this:
from requests_html import HTMLSession
session = HTMLSession()
r = session.get('https://www.winamax.fr/paris-sportifs/sports/1/1/1')
r.html.render()
print(r.html.text) # I also tried print(r.html.html)
the games don't show up in the output.
Why? Thanks!
Upvotes: 4
Views: 9294
Reputation: 53
I found that passing the sleep parameter to the render function, so that it waits a few seconds after the initial render, was the only thing that worked for me:
from requests_html import HTMLSession
session = HTMLSession()
r = session.get('https://www.winamax.fr/paris-sportifs/sports/1/1/1')
r.html.render(sleep=10)
print(r.html.html)
session.close()
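A fixed sleep=10 can be longer than needed, or still too short on a slow connection. As a minimal sketch, one could retry with growing sleep values and stop as soon as some expected text appears in the rendered HTML. The helper name render_until_present, its parameters, and the marker text are hypothetical and not part of requests-html; only render(sleep=..., timeout=...) and the .html attribute come from the library.

```python
def render_until_present(html, marker, sleeps=(2, 5, 10), timeout=20):
    """Call html.render() with growing sleep values until `marker` shows up.

    `html` is expected to behave like `r.html` from requests-html:
    it needs a render(sleep=..., timeout=...) method and an `.html` string.
    Returns the sleep value that worked, or None if the marker never appeared.
    """
    for seconds in sleeps:
        # Each call re-renders the page in Chromium and waits `seconds`
        # after the initial render before the HTML is captured.
        html.render(sleep=seconds, timeout=timeout)
        if marker in html.html:
            return seconds
    return None


# Usage (needs network access and the Chromium download done by requests-html):
#
#   from requests_html import HTMLSession
#   session = HTMLSession()
#   r = session.get('https://www.winamax.fr/paris-sportifs/sports/1/1/1')
#   used = render_until_present(r.html, 'West Ham')  # marker is only an example
#   print(used, len(r.html.html))
#   session.close()
```

Checking for a marker avoids guessing a single delay, at the cost of re-rendering the page when the first attempt is too short.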
From the requests-html documentation:
render(retries: int = 8, script: str = None, wait: float = 0.2, scrolldown=False, sleep: int = 0, reload: bool = True, timeout: Union[float, int] = 8.0, keep_page: bool = False, cookies: list = [{}], send_cookies_session: bool = False)
Reloads the response in Chromium, and replaces HTML content with an updated version, with JavaScript executed.
Parameters:
retries – The number of times to retry loading the page in Chromium.
script – JavaScript to execute upon page load (optional).
wait – The number of seconds to wait before loading the page, preventing timeouts (optional).
scrolldown – Integer, if provided, of how many times to page down.
sleep – Integer, if provided, of how many seconds to sleep after initial render.
reload – If False, content will not be loaded from the browser, but will be provided from memory.
keep_page – If True will allow you to interact with the browser page through r.html.page.
send_cookies_session – If True send HTMLSession.cookies convert.
cookies – If not empty send cookies.
Upvotes: 1
Reputation: 308
Add a timeout and it should work. Sorry, this should be a comment, but I cannot comment yet.
from requests_html import HTMLSession
session = HTMLSession()
r = session.get('https://www.winamax.fr/paris-sportifs/sports/1/1/1')
r.html.render(timeout=20)
print(r.html.html)
session.close()
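If it is unclear which timeout is enough, a rough sketch like the one below tries a few values in turn. The helper name render_with_timeouts and the list of timeouts are hypothetical; the broad except is a deliberate assumption so the sketch does not depend on the exact exception class the library raises when rendering fails.

```python
def render_with_timeouts(html, timeouts=(8, 20, 40)):
    """Try html.render() with progressively larger timeouts.

    `html` is expected to behave like `r.html` from requests-html.
    Returns the first timeout for which render() completed, or None
    if every attempt raised an error.
    """
    for seconds in timeouts:
        try:
            html.render(timeout=seconds)
            return seconds
        except Exception as exc:  # assumption: any failure means "try a longer timeout"
            print(f"render(timeout={seconds}) failed: {exc!r}")
    return None


# Usage (needs network access and the Chromium download done by requests-html):
#
#   from requests_html import HTMLSession
#   session = HTMLSession()
#   r = session.get('https://www.winamax.fr/paris-sportifs/sports/1/1/1')
#   print(render_with_timeouts(r.html))
#   session.close()
```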
Upvotes: 4