user1257144

Reputation: 210

How can I get the HTML generated by JavaScript?

I want to get the HTML content of a web page, but most of the content is generated by JavaScript.

Is it possible to get this generated HTML (with Python, if possible)?

Upvotes: 3

Views: 2750

Answers (3)

Kien Truong

Reputation: 11381

Check out Selenium. It has a Python driver, which might be what you're looking for.
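A minimal sketch of what that could look like, assuming Selenium 4+ and a local Chrome install. The helper names `default_chrome_driver` and `get_rendered_html` are made up for illustration, and a page that keeps loading content after its load event would likely need an explicit wait on top of this:

```python
def default_chrome_driver():
    """Start a headless Chrome session (assumes Selenium 4+ and Chrome are installed)."""
    # Imported lazily so this module can be loaded without Selenium present.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    # Run without a visible window; older Chrome versions use plain "--headless".
    options.add_argument("--headless=new")
    return webdriver.Chrome(options=options)


def get_rendered_html(url, driver_factory=default_chrome_driver):
    """Load `url` in a browser and return the DOM serialized after scripts have run."""
    driver = driver_factory()
    try:
        # With the default page load strategy, get() returns after the load event.
        driver.get(url)
        # page_source reflects the current DOM, including script-generated nodes.
        return driver.page_source
    finally:
        # Always shut the browser down, even if loading failed.
        driver.quit()
```

Calling `get_rendered_html("https://example.com")` would then return the page's HTML as a string. The `driver_factory` parameter is only there so a different browser (or a stub) can be swapped in.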

Upvotes: 2

MrCode

Reputation: 64536

If most of the content is generated by JavaScript, then the JavaScript may be making AJAX calls to retrieve the content. You may be able to call those server-side scripts directly from your Python app.

Do check that this doesn't violate the website's terms, though, and get permission.
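As a rough sketch of that idea, assuming the page's script fetches JSON from an endpoint you have found in the browser's network tab. The function name `fetch_json` and the example URL are hypothetical, and the real endpoint may expect different headers, cookies, or parameters:

```python
import json
import urllib.request


def fetch_json(url, opener=urllib.request.urlopen):
    """Request `url` the way a page's AJAX call might and return the decoded JSON."""
    request = urllib.request.Request(
        url,
        headers={
            "Accept": "application/json",
            # Some servers check this header to recognise AJAX requests (assumption).
            "X-Requested-With": "XMLHttpRequest",
        },
    )
    # `opener` defaults to urlopen; it is a parameter so it can be replaced in tests.
    with opener(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `fetch_json("https://example.com/api/items")` would return whatever JSON structure that (made-up) endpoint serves, which you can then use directly instead of scraping HTML.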

Upvotes: 0

jfriend00

Reputation: 708126

The only way I know of to do this from your server is to run the page in an actual browser engine that will parse the HTML, build the normal DOM environment, and run the JavaScript in the page. You then reach into that DOM and get the innerHTML of the body tag.

This could be done by launching Chrome with the appropriate URL from Python, then using a Chrome extension to fetch the dynamically generated HTML once the page has finished initializing and send it back to your Python code.
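A simpler variation on the same idea, sketched here as an alternative rather than the extension approach itself: Chrome's headless mode has a `--dump-dom` flag that prints the serialized DOM to stdout, so Python can read it from a subprocess. The binary name `google-chrome` is an assumption and differs between systems, and both function names are made up for illustration:

```python
import subprocess


def build_dump_dom_command(url, chrome_binary="google-chrome"):
    """Build the command line that asks headless Chrome to print the DOM of `url`."""
    # The binary name is an assumption; it may be "chromium", "chrome", or a full path.
    return [chrome_binary, "--headless", "--dump-dom", url]


def dump_dom(url, chrome_binary="google-chrome", timeout=60):
    """Run headless Chrome and return the DOM it prints to stdout."""
    result = subprocess.run(
        build_dump_dom_command(url, chrome_binary),
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode the output to str
        timeout=timeout,      # give up if the browser hangs
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

`dump_dom("https://example.com")` would then return the HTML after the page's scripts have run, without needing an extension to pass the result back.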

Upvotes: 3
