Reputation: 75
I need to scrape a web page for all the links so I can visit them later to find and record where cookies are set. This is for the new UK legislation that requires users to be informed of the cookies that are set; I've decided to try to automate some of this process to save time.
My problem is that my company's sites use a lot of JavaScript to render the pages and content, which means that when I retrieve the pages (using the Html Agility Pack at the minute) they mainly contain a lot of JavaScript and are missing many of the links that show up when fully rendered. I'm hosting this as an ASP application on one domain and pass in URLs to scrape, then visit all the links on the sites' pages.
Is there a way I can execute the JavaScript so the pages are rendered and I can get all the links?
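For reference, this is roughly how I'm retrieving the pages and pulling out links at the moment (simplified, and the URL is just a placeholder) - it only sees the raw HTML, so anything added by JavaScript is missing:

    using System;
    using HtmlAgilityPack;

    class LinkLister
    {
        static void Main()
        {
            // Downloads the raw HTML only - links that are added by JavaScript
            // after the page loads will not appear in this document.
            var web = new HtmlWeb();
            HtmlDocument doc = web.Load("http://www.example.com"); // placeholder URL

            var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
            if (anchors != null)
            {
                foreach (HtmlNode anchor in anchors)
                    Console.WriteLine(anchor.GetAttributeValue("href", ""));
            }
        }
    }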
Upvotes: 1
Views: 1416
Reputation: 902
You could make a Windows Forms application with a WebBrowser control. You can set the URL and register a callback for when the page has loaded. It will render the page, including the JavaScript, and then you can access the DOM (I think through WebBrowser.Document).
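Roughly what I have in mind (an untested sketch; the class name and URL are placeholders, and note that DocumentCompleted can fire before late AJAX content has been added, so you may need extra waiting logic):

    using System;
    using System.Collections.Generic;
    using System.Windows.Forms;

    class RenderedLinkScraper
    {
        [STAThread]
        static void Main()
        {
            var links = new List<string>();

            // The WebBrowser control needs an STA thread and a message loop
            // in order to load the page and run its JavaScript.
            var browser = new WebBrowser { ScriptErrorsSuppressed = true };

            browser.DocumentCompleted += (sender, e) =>
            {
                // The page has been rendered; the DOM is available via browser.Document.
                foreach (HtmlElement anchor in browser.Document.Links)
                {
                    string href = anchor.GetAttribute("href");
                    if (!string.IsNullOrEmpty(href))
                        links.Add(href);
                }
                Application.ExitThread(); // stop the message loop once we're done
            };

            browser.Navigate("http://www.example.com"); // placeholder URL
            Application.Run(); // pump messages until ExitThread is called

            foreach (string link in links)
                Console.WriteLine(link);
        }
    }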
Upvotes: 2
Reputation: 5681
I don't understand your problem. If it is your company's website, you don't need to scrape the pages. You already have the code. Just look at your codebase and see where cookies are created and what is stored in them.
Upvotes: 0