Alex Martian

Reputation: 3812

Python: read an HTML page over HTTP as seen in the browser, with JavaScript results

Can I get an HTTP page as it appears in the browser, fully formed after its JavaScript has run? I don't need to submit data or press buttons. IMHO it's a standard task; where can I find an example that runs all the scripts and returns the result?

Via:

import urllib.request

u = urllib.request.urlopen('https://www.*')
data = u.read()

I get the page as it appears when I choose "view source" in the browser. However, when I inspect elements on the page, I see the markup expanded, e.g.:

<div class="js-events-container"></div>

expands to:

<div class="js-events-container">    <table class="zebra noBorderTbl" style="width: 100%;">
        <tbody><tr>
            <th>1</th>
            <th>2</th>
            <th>3</th>
        </tr>
...
        </tr>
            </tbody></table>
</div>

Upvotes: 0

Views: 54

Answers (3)

Alex Martian

Reputation: 3812

I now load the page using Selenium and then read page_source. Despite its name, page_source returns not the original page source but the page as it stands after the JavaScript has run.
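A minimal sketch of that approach, assuming Selenium 4 with a local Chrome install. The function name fetch_rendered_html, the headless flag, and the optional CSS selector to wait for are illustrative choices of mine, not part of the original setup:

```python
def fetch_rendered_html(url, css_selector=None, wait_seconds=10):
    """Load url in headless Chrome and return the DOM after scripts have run.

    Hypothetical helper: assumes Selenium 4 and Chrome are installed.
    """
    # Imported here so the module still loads where Selenium is missing.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")  # no visible browser window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        if css_selector:
            # Scripts may fill the page after load, so wait for a known element.
            WebDriverWait(driver, wait_seconds).until(
                EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
            )
        # Serialized current DOM, not the bytes the server originally sent.
        return driver.page_source
    finally:
        driver.quit()  # always release the browser process
```

For the page in the question, a call might look like fetch_rendered_html(url, css_selector=".js-events-container table"), so that the function only returns once the table has been inserted. Without a selector it returns whatever the DOM contains right after the initial load, which may still be incomplete.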

Upvotes: 0

Rajesh Yogeshwar

Reputation: 2179

You can also take a look at the dryscrape library. According to its documentation, it is JavaScript-aware.

Upvotes: 0

bertus wisman

Reputation: 51

I see "js" in the class name, so the content is probably generated by JavaScript. I don't think there is a way to get the full page with urllib. You need to grab the page after the JavaScript has run, so you will need Selenium or PhantomJS to do the job.
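One rough way to confirm that guess before reaching for a browser is to check whether the container is empty in the HTML that urllib returns. The sketch below uses only the standard library; ContainerProbe and container_is_populated are hypothetical names, and the nesting logic assumes reasonably well-formed markup with explicit closing tags:

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ContainerProbe(HTMLParser):
    """Counts tags nested inside the first element carrying class_name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.found = False
        self.depth = 0       # 0 means we are outside the container
        self.child_tags = 0

    def handle_starttag(self, tag, attrs):
        if self.depth > 0:
            self.child_tags += 1
            if tag not in VOID_TAGS:
                self.depth += 1
        elif not self.found:
            classes = (dict(attrs).get("class") or "").split()
            if self.class_name in classes:
                self.found = True
                self.depth = 1

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID_TAGS:
            self.depth -= 1


def container_is_populated(html, class_name):
    """Return True if the container exists and has at least one child tag."""
    probe = ContainerProbe(class_name)
    probe.feed(html)
    probe.close()
    return probe.found and probe.child_tags > 0
```

If this returns False for the data read with urllib but the browser's inspector shows a table inside the same div, the content is being added by a script after the page loads.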

Upvotes: 1
