Reputation: 138
I want to get the links from a website using a C# console application and Html Agility Pack, but the li and a tags contain JavaScript code instead of real links. I don't know why JavaScript changes the markup on click. How can I get the actual HTML?
<li onmouseover="activate_menu('top-menu-61', 61); void(0);" onmouseout="deactivate_menu('top-menu-61', 61);"><a href="javascript:void();
This is all I can see in my li and a tags. How can I resolve this and get the actual HTML so I can extract the links?
Upvotes: 0
Views: 1058
Reputation: 89285
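Try using a browser automation tool such as Selenium WebDriver to render the page fully in a real browser before passing it to HtmlAgilityPack for parsing. HtmlAgilityPack only parses the HTML it is given and does not execute JavaScript, so any content that scripts generate is missing unless a browser renders the page first. Using Selenium should be fairly easy, as the example below shows. You only need to make sure that the required tools (the Selenium library and the browser driver of your choice) are installed properly beforehand: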
using HtmlAgilityPack;
using OpenQA.Selenium.Chrome;

// Initialize the Chrome driver (or any other supported browser)
using (var driver = new ChromeDriver())
{
    // Open the target page
    driver.Navigate().GoToUrl("the_target_page_url_here");

    // Add Selenium waits here if needed,
    // to wait until a certain element appears in the page

    // Pass the rendered HTML to HAP's HtmlDocument
    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(driver.PageSource);
}
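Once the rendered HTML is loaded, the links can be pulled out with an XPath query. The following is only a minimal sketch: the class name LinkExtractor and the method ExtractLinks are hypothetical helpers made up for illustration, and skipping hrefs that start with "javascript:" is an assumption about what counts as a real link on your page.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper, not part of HtmlAgilityPack or Selenium.
public static class LinkExtractor
{
    // Returns the href values of all <a> elements in the given HTML,
    // skipping empty values and javascript: pseudo-links.
    public static List<string> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var links = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return links;
        }

        foreach (var anchor in anchors)
        {
            string href = anchor.GetAttributeValue("href", "").Trim();
            if (href.Length > 0 &&
                !href.StartsWith("javascript:", StringComparison.OrdinalIgnoreCase))
            {
                links.Add(href);
            }
        }

        return links;
    }
}
```

Calling LinkExtractor.ExtractLinks(driver.PageSource) while the driver is still open would then return the hrefs found in the rendered page. If a link is only produced inside a click handler and never written to an href attribute, it will not show up this way.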
Selenium also provides ways to locate elements within a page, so it is possible to replace HAP completely with Selenium, if you want.
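A Selenium-only version might look roughly like the sketch below. It makes several assumptions: the URL is a placeholder, the 10 second timeout and the a[href] selector are arbitrary choices, and WebDriverWait comes from the OpenQA.Selenium.Support.UI namespace (shipped in the Selenium.Support package in the versions I have used). Recent Selenium releases also offer GetDomAttribute as an alternative to GetAttribute.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

class Program
{
    static void Main()
    {
        using (var driver = new ChromeDriver())
        {
            // Placeholder URL: replace with the page you want to scrape.
            driver.Navigate().GoToUrl("https://example.com/");

            // Wait (up to 10 seconds) until at least one link is present.
            var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(10));
            wait.Until(d => d.FindElements(By.CssSelector("a[href]")).Count > 0);

            // Read the href of every anchor, skipping javascript: pseudo-links.
            foreach (IWebElement anchor in driver.FindElements(By.CssSelector("a[href]")))
            {
                string href = anchor.GetAttribute("href");
                if (!string.IsNullOrEmpty(href) &&
                    !href.StartsWith("javascript:", StringComparison.OrdinalIgnoreCase))
                {
                    Console.WriteLine(href);
                }
            }
        }
    }
}
```

The trade-off is that Selenium queries go through the live browser, which is slower than parsing a string, but they always reflect the current state of the page.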
Upvotes: 1