Reputation: 60811
Is it possible to scrape all the text from a site that was navigated to by WebBrowser
control without looking at the source?
Upvotes: 2
Views: 15480
Reputation: 499132
You can use the DocumentText
property of the WebBrowser control.
This property is what holds the HTML of the site you have navigated to.
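As a rough sketch of how that might look (the URL is a placeholder, and hosting the control in a bare Form is just my assumption for the example), you would read DocumentText once the DocumentCompleted event has fired, since the property is not fully populated before then:

```csharp
using System;
using System.Windows.Forms;

static class Program
{
    [STAThread] // WebBrowser requires a single-threaded apartment
    static void Main()
    {
        var form = new Form();
        var wb = new WebBrowser { Dock = DockStyle.Fill };
        form.Controls.Add(wb);

        // DocumentText holds the page's HTML once loading has finished.
        wb.DocumentCompleted += (sender, e) =>
        {
            string html = wb.DocumentText;
            Console.WriteLine("Loaded {0} characters of HTML", html.Length);
        };

        wb.Navigate("https://example.com/"); // placeholder URL
        Application.Run(form);
    }
}
```

Note that DocumentCompleted can fire more than once for pages that contain frames, so you may want to guard against handling it repeatedly.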
Update: (following comments)
If you want to parse the HTML and get the text parts of it, I suggest you use the HTML Agility Pack.
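A minimal sketch of that, assuming the HtmlAgilityPack NuGet package is referenced (TextExtractor and ExtractText are names I made up for the example): load the HTML string, drop script and style elements, and take the inner text of what is left.

```csharp
using HtmlAgilityPack; // third-party NuGet package

static class TextExtractor
{
    // Hypothetical helper: returns the visible text of an HTML string.
    public static string ExtractText(string html)
    {
        // Fully qualified to avoid a clash with System.Windows.Forms.HtmlDocument.
        var doc = new HtmlAgilityPack.HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when nothing matches.
        var junk = doc.DocumentNode.SelectNodes("//script|//style");
        if (junk != null)
        {
            foreach (var node in junk)
                node.Remove(); // script and style bodies are not page text
        }

        // InnerText still contains entities such as &amp;, so decode them.
        return HtmlEntity.DeEntitize(doc.DocumentNode.InnerText);
    }
}
```

You could then call it as TextExtractor.ExtractText(wb.DocumentText) once the page has finished loading.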
Upvotes: 4
Reputation: 380
David Walker's method is great when you don't need any info from the header or the non-main parts of the webpage. If you need something outside the inner text, there are only two options: one is to parse with the "getElement" methods; the other is to issue commands (Document.ExecCommand) to the WebBrowser to select all and copy to the clipboard:
wb.Document.ExecCommand("SelectAll", false, null);
wb.Document.ExecCommand("Copy", false, null);
Then finally read it back with string content = Clipboard.GetText();
Please note the spelling and syntax may not be exact; I'm recalling this from memory.
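Here is a hedged sketch of both options put together. PageText, ViaElements and ViaClipboard are names invented for the example, and the choice of the title and the meta description as "header info" is only an assumption about what you might want. Both methods should be called on the UI (STA) thread after the page has finished loading.

```csharp
using System;
using System.Windows.Forms;

static class PageText
{
    // Option 1: walk the DOM with the GetElement* methods.
    public static string ViaElements(WebBrowser wb)
    {
        HtmlDocument doc = wb.Document;
        string description = "";

        // Example of reading something from the header: the meta description.
        foreach (HtmlElement meta in doc.GetElementsByTagName("meta"))
        {
            if (meta.GetAttribute("name") == "description")
                description = meta.GetAttribute("content");
        }

        string body = doc.Body != null ? doc.Body.InnerText : "";
        return doc.Title + Environment.NewLine
             + description + Environment.NewLine
             + body;
    }

    // Option 2: select everything, copy it, and read the clipboard.
    public static string ViaClipboard(WebBrowser wb)
    {
        wb.Document.ExecCommand("SelectAll", false, null);
        wb.Document.ExecCommand("Copy", false, null);
        return Clipboard.GetText();
    }
}
```

Keep in mind that the clipboard route overwrites whatever the user had on the clipboard, so the element-based route is usually the less intrusive of the two.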
Upvotes: 7