Rn2dy

Reputation: 4190

How to get raw page source (not generated source) from C#

The goal is to get the raw source of the page, that is, without running the scripts or letting the browser reformat the markup at all. For example, if the response body is <table><tr></table>, I don't want to get <table><tbody><tr></tr></tbody></table>. How can I do this in C# code?

More info: for example, typing "view-source:http://feeds.gawker.com/kotaku/full" in the browser's address bar gives you an XML file, but if you just open "http://feeds.gawker.com/kotaku/full" the browser renders an HTML page. What I want is the XML file. I hope this is clear.

Upvotes: 0

Views: 843

Answers (3)

AakashM

Reputation: 63378

You can use a tool such as Fiddler to see what is actually being sent over the wire.

disclaimer: I think Fiddler is amazing

Upvotes: 0

TheCodeKing

Reputation: 19240

If you mean when rendering your own page, you can get access to the raw page content using a ResponseFilter, or by overriding page render. I would question your motives for doing this, though.

Scripts run client-side, so they have no bearing on any C# code.

Upvotes: 0

spender

Reputation: 120538

Here's one way, but it's not really clear what you actually want.

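using System.Net;

using (var wc = new WebClient())
{
    // DownloadString returns the response body as text; nothing parses the HTML or runs scripts.
    var source = wc.DownloadString("http://google.com");
}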
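
If you need the body byte for byte rather than as a decoded string, something along these lines may work. It is only a sketch: the `RawSource` class and `FetchBytesAsync` helper are names made up for illustration, the feed URL is the one from the question and may no longer resolve, and decoding as UTF-8 is an assumption. It also assumes a compiler that supports `async Main` (C# 7.1 or later).

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

static class RawSource
{
    // Hypothetical helper: returns the response body as the HTTP client received it.
    // No HTML parsing, DOM construction, or script execution is involved.
    public static async Task<byte[]> FetchBytesAsync(HttpClient client, string url)
    {
        using (var response = await client.GetAsync(url))
        {
            // Throws HttpRequestException on a non-2xx status code.
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsByteArrayAsync();
        }
    }

    static async Task Main()
    {
        using (var client = new HttpClient())
        {
            // URL taken from the question; it may no longer be reachable.
            byte[] body = await FetchBytesAsync(client, "http://feeds.gawker.com/kotaku/full");

            // Assumption: the feed is UTF-8. Check the Content-Type charset if it matters.
            Console.WriteLine(Encoding.UTF8.GetString(body));
        }
    }
}
```

Since nothing here builds a DOM, a body of <table><tr></table> should come back exactly as sent, without the <tbody> a browser would insert.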

Upvotes: 1
