ROBOTPWNS

Reputation: 4419

Wget source code for resulting webpage after querying?

I'm trying to count how many times a search box on a website returns an error when I submit a bulk set of test data. My plan is to wget each query's result page and check whether the word "Error" appears in the HTML. I build the query URL and use wget to download the resulting page.

However, only the main HTML content is downloaded, not the results, because they are generated by an external JavaScript file. The HTML I want can only be seen by right-clicking and choosing View Page Source in my browser. Is there a non-manual way to use wget/curl to download that page source, instead of having to click through every result by hand?

Upvotes: 1

Views: 203

Answers (1)

user3617878

Reputation:

The JavaScript is a program, and in general you can't determine a program's output without actually running it. So the practical approach is to load the JavaScript in a real browser environment and execute it against your test cases.

Wget and curl can't do that: they have no facility to execute the JavaScript they fetch. Practically speaking, what you need is a real browser that you can drive from the shell the way you would drive wget/curl. Luckily, such a thing already exists: Selenium. It automates a running instance of Firefox, Chrome, or Internet Explorer, making the browser scriptable and easy to control remotely.
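
For example, a minimal sketch with the Python Selenium bindings might look like the following. The URL, query parameter, and test data are placeholders for whatever your site actually uses, and the fixed sleep is a crude stand-in for a proper explicit wait (WebDriverWait):

    # A minimal sketch using the Python Selenium bindings.
    # The URL and query parameter below are hypothetical placeholders.
    import time
    from selenium import webdriver

    queries = ["foo", "bar", "baz"]  # your bulk test data goes here

    driver = webdriver.Firefox()     # or webdriver.Chrome()
    error_count = 0
    try:
        for q in queries:
            driver.get("https://example.com/search?q=" + q)
            time.sleep(2)  # crude wait for the JavaScript to render the results
            # page_source returns the current DOM, i.e. the HTML *after*
            # the external JavaScript has run -- exactly what wget never sees
            if "Error" in driver.page_source:
                error_count += 1
    finally:
        driver.quit()

    print("%d of %d queries produced an error" % (error_count, len(queries)))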

If you want to run these browsers noninteractively, without a GUI, I suggest using a fake (hardware-less) X server such as Xvfb.
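
One way to wire that up from Python, sketched below, is the pyvirtualdisplay package, a wrapper around Xvfb. The package is my suggestion rather than part of Selenium, and the snippet assumes both Xvfb and pyvirtualdisplay are installed:

    # A sketch of the headless-X approach via pyvirtualdisplay (wraps Xvfb).
    from pyvirtualdisplay import Display
    from selenium import webdriver

    display = Display(visible=0, size=(1024, 768))  # hardware-less X server
    display.start()
    try:
        driver = webdriver.Firefox()  # renders into the virtual display
        driver.get("https://example.com/search?q=test")  # placeholder URL
        print("Error" in driver.page_source)
        driver.quit()
    finally:
        display.stop()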

Google for "selenium" and for "headless X". Good luck!

Upvotes: 1
