Reputation: 239
I am developing a project for college and i need some suggestions over the development. Its a website which shows information from other websites like Links, Images etc.
I have prepared below given model for the website.
A Home.aspx page which shows data from tables (sql server).
I have coded a crawler (in c#) which can crawl (fetch data) required website data.
I want some way through which i can run the crawler at back end for some time interval and it can insert updates in the tables. I want that i can get updated information in my database so that Home.aspx shows updated info. (Its like smaller version of Google News website)
I want to host the wesbite in Shared Hosted Environment (i.e a 3rd party hosting provider company and that may use IIS platform)
I posted simliar situation to different .NET forums and communities and they suggested lot of different things such as
Create a web service (is it really necessary ?)
Use WCF
Create a Console application and run windows task sheduler (is it okay with asp.net (win forms website) and in shared hosted)
Run crawler on local machine and update database accordingly. (No i want everything online) etc etc
Please suggest me a clear way out so that i complete the task. Please suggest elobrated technology and methods which suits my project.
Waiting...
Thanks...
Upvotes: 0
Views: 726
Reputation: 286
Your shared host constraint really impacts on technologies restrictions.
In theory, the best way to host your crawler would have been a Windows service, since you can take advantage of windows services configuration. A service is always up, can be automatically started at startup, writes errors in event log, can be automatically restarted after failure...
Then, you Home.aspx would have been a regular website in IIS.
If you really stay on a shared host (where you cannot setup a service), I would have make the crawler as a module which is run on your application startup.
Problem is, IIS application pool doesnt live forever if your web site is not in use, and it may stop the crawler. It is configurable, but I dont know how much in a shared host.
In IIS 7.5, think about starting your module at application warm up
Finally if you need to run the crawler at interval times (like every day at midnight), if your shared host does not let you set task scheduling, think about Quartz Framework, which allow you perform task scheduling inside your application (without the intervention of the OS)
Upvotes: 3
Reputation: 1501
Upvotes: 2