Reputation: 2569
I need to crawl websites and extract some information from dynamically created pages after a form submission.
The information I need to extract mostly comes from the databases behind these sites.
Added:
Crawlers usually work by following one hyperlink to the next, so they mostly reach static pages. What about crawling pages that are not statically present but are created on the fly?
Upvotes: 1
Views: 1026
Reputation: 100110
From the crawler's point of view there's no big difference. You're still getting generated HTML.
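To make that concrete, here is a minimal sketch of submitting a form from a script and reading back the generated HTML, using only Python's standard library. The function names `build_form_request` and `fetch_form_result`, the URL, and the field names are hypothetical; a real site's form action and input names have to be read from its HTML, and some forms use GET or need cookies or tokens that this sketch does not handle.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_form_request(action_url, fields):
    """Build a POST request that mimics a browser submitting an HTML form."""
    # Form fields are sent URL-encoded in the request body.
    data = urlencode(fields).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    # Supplying `data` makes urllib issue a POST instead of a GET.
    return Request(action_url, data=data, headers=headers)


def fetch_form_result(action_url, fields, timeout=10):
    """Submit the form and return the server-generated HTML as text."""
    request = build_form_request(action_url, fields)
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Something like `fetch_form_result("http://example.test/search", {"q": "crawler"})` would then return ordinary HTML that can be parsed the same way as a static page.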
The only thing you need to be careful about is links that lead to an infinite number of pages, e.g. a dynamically generated calendar with links to the next/previous month or year.
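One common way to guard against such traps is to cap both the link depth and the total number of pages. The sketch below is an assumption about how that could look, not a complete crawler: `crawl`, `LinkParser`, `max_depth`, and `max_pages` are illustrative names, and `fetch` is any function you supply that takes a URL and returns its HTML.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_depth=3, max_pages=50):
    """Breadth-first crawl that stops at a depth limit and a page budget."""
    seen = {start_url}
    queue = deque([(start_url, 0)])
    visited = []
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        html = fetch(url)
        visited.append(url)
        if depth >= max_depth:
            continue  # do not follow links any deeper than the limit
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments before deduplicating.
            link, _fragment = urldefrag(urljoin(url, href))
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited
```

With these limits, a calendar that links to next and previous months forever still ends the crawl after a bounded number of requests. The right values depend on the site, and a per-host budget or URL-pattern filter may be a better fit in practice.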
Upvotes: 1