Is there a straightforward way to retrieve text that is rendered by the browser but is not hard-coded in the actual HTML file?

Question

I'm trying to retrieve data from a webpage but I cannot do it by making a web request and parsing the resulting html file because the actual text that I'm trying to retrieve is not in the html file! I imagine that this text is pulled using some script and for that reason it's not in the html file. For all I know I'm looking at the wrong data, but assuming that my theory is correct, is there a straightforward way to retrieve whatever text is displayed by the browser (Firefox or IE) rather than attempt to fetch the text from the html file?

cowls · Accepted Answer

I'm assuming you are referring to text that is generated by JavaScript running in the browser.

You can use PhantomJS to achieve this: http://phantomjs.org/

It is essentially a headless browser that will execute the page's JavaScript.

You may need to run this as an external program, but I'm sure you can do that from C#.
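As a rough illustration, a minimal PhantomJS script for this could look something like the sketch below. The file name `render-text.js`, the one-second wait, and the helper name `extractVisibleText` are my own assumptions for the example; the right delay depends on how the page loads its data.

```javascript
// render-text.js (hypothetical file name) - run with the PhantomJS binary, not Node.

// Runs inside the loaded page, so `document` here is the page's DOM
// after its scripts have had a chance to modify it.
function extractVisibleText() {
    return document.body.innerText;
}

// Only drive the headless browser when running under PhantomJS.
if (typeof phantom !== 'undefined') {
    var system = require('system');
    var page = require('webpage').create();
    var url = system.args[1]; // first command-line argument after the script name

    page.open(url, function (status) {
        if (status !== 'success') {
            console.log('Failed to load ' + url);
            phantom.exit(1);
            return;
        }
        // Assumption: one second is enough for the page's scripts to finish.
        setTimeout(function () {
            // evaluate() runs the function in the page and returns its result.
            console.log(page.evaluate(extractVisibleText));
            phantom.exit(0);
        }, 1000);
    });
}
```

You would run it as `phantomjs render-text.js <url>` and read the rendered text from standard output. From C#, that probably means starting the process with `System.Diagnostics.Process` and redirecting its standard output.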
