ItsLockedOut

Reputation: 239

Which technology to use for running crawler and updating database in asp.net website?

I am developing a project for college and I need some suggestions on the development. It's a website that shows information from other websites, such as links, images, etc.

I have prepared below given model for the website.

A Home.aspx page which shows data from tables (sql server).

I have coded a crawler (in c#) which can crawl (fetch data) required website data.

I want some way to run the crawler in the background at some time interval so that it can insert updates into the tables. That way the database stays up to date and Home.aspx shows updated info. (It's like a smaller version of the Google News website.)

I want to host the website in a shared hosting environment (i.e., a 3rd-party hosting provider company, which may use the IIS platform).

I posted a similar situation to different .NET forums and communities and they suggested a lot of different things.

Please suggest a clear way for me to complete the task. Please suggest elaborated technologies and methods that suit my project.

Waiting...

Thanks...

Upvotes: 0

Views: 726

Answers (2)

alex.net

Reputation: 286

Your shared-host constraint really restricts which technologies you can use.

In theory, the best way to host your crawler would be a Windows service, since you can take advantage of Windows service configuration: a service is always up, can be started automatically at boot, writes errors to the event log, can be restarted automatically after a failure, and so on.
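The Windows-service approach above could be sketched roughly like this; `Crawler` and its `Run()` method are placeholders for your own crawler class, and the 30-minute interval is an arbitrary assumption:

```csharp
using System;
using System.ServiceProcess;
using System.Timers;

// Minimal sketch of hosting the crawler in a Windows service.
// The service starts a timer at OnStart and stops it at OnStop;
// the Service Control Manager handles startup/restart policy.
public class CrawlerService : ServiceBase
{
    private Timer _timer;

    protected override void OnStart(string[] args)
    {
        _timer = new Timer(TimeSpan.FromMinutes(30).TotalMilliseconds); // crawl interval (assumed)
        _timer.Elapsed += (sender, e) => new Crawler().Run();           // your crawler class (assumed)
        _timer.Start();
    }

    protected override void OnStop()
    {
        _timer.Stop();
    }

    public static void Main()
    {
        ServiceBase.Run(new CrawlerService());
    }
}
```

You would install it with `installutil` (or a setup project) and configure automatic start and recovery options in the Services console.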

Your Home.aspx would then be a regular website in IIS.

If you really must stay on a shared host (where you cannot set up a service), I would make the crawler a module that is run at your application's startup.

The problem is that an IIS application pool doesn't live forever if your website is not in use, and when the pool is recycled it may stop the crawler. This is configurable, but I don't know how much control you get on a shared host.
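The run-at-startup idea could look like the sketch below: a background thread kicked off from `Application_Start` in Global.asax. Again, `Crawler` and the interval are assumptions, and note that the thread dies whenever IIS recycles the application pool:

```csharp
// Global.asax.cs — start the crawler on a background thread when the app spins up.
using System;
using System.Threading;

public class Global : System.Web.HttpApplication
{
    private static Thread _crawlerThread;

    protected void Application_Start(object sender, EventArgs e)
    {
        _crawlerThread = new Thread(() =>
        {
            while (true)
            {
                new Crawler().Run();                     // your crawler class (assumed)
                Thread.Sleep(TimeSpan.FromMinutes(30));  // interval between crawls (assumed)
            }
        });
        _crawlerThread.IsBackground = true; // don't keep the app domain alive on shutdown
        _crawlerThread.Start();
    }
}
```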

In IIS 7.5, think about starting your module at application warm-up.
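On IIS 7.5 that warm-up behavior comes from the auto-start settings in applicationHost.config; something along these lines (pool and site names are placeholders), though on a shared host you typically cannot edit this file yourself:

```xml
<!-- applicationHost.config fragment: keep the pool running and preload the app -->
<applicationPools>
  <add name="MyAppPool" startMode="AlwaysRunning" />
</applicationPools>

<sites>
  <site name="MySite">
    <application path="/" applicationPool="MyAppPool"
                 serviceAutoStartEnabled="true" />
  </site>
</sites>
```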

Finally, if you need to run the crawler at fixed intervals (like every day at midnight) and your shared host does not let you set up task scheduling, look at the Quartz.NET framework, which lets you perform task scheduling inside your application (without the intervention of the OS).
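With Quartz.NET (2.x-style API) the in-process scheduling could be sketched like this; `CrawlerSchedule.Start()` would be called once from `Application_Start`, and `Crawler` is again a stand-in for your own class:

```csharp
using Quartz;
using Quartz.Impl;

// Quartz.NET sketch: run the crawler inside the web app every day at midnight.
public class CrawlerJob : IJob
{
    public void Execute(IJobExecutionContext context)
    {
        new Crawler().Run(); // your crawler class (assumed)
    }
}

public static class CrawlerSchedule
{
    // Call once from Application_Start.
    public static void Start()
    {
        IScheduler scheduler = StdSchedulerFactory.GetDefaultScheduler();
        scheduler.Start();

        IJobDetail job = JobBuilder.Create<CrawlerJob>().Build();
        ITrigger trigger = TriggerBuilder.Create()
            .WithCronSchedule("0 0 0 * * ?") // fire at 00:00:00 every day
            .Build();

        scheduler.ScheduleJob(job, trigger);
    }
}
```

The same app-pool caveat applies: the scheduler only fires while the application is loaded, so it pairs well with the warm-up/auto-start settings above.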

Upvotes: 3

Craig Stewart

Reputation: 1501

  • Integrate your crawler code into an aspx page
  • Set up a task scheduler on your host to call that page every X minutes
  • When the page is called, check whether localhost called it
    • If localhost called it, run the crawl routine
    • If localhost didn't call it, throw a 404 error
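The steps above could be sketched as the code-behind of such a page; the page and class names are placeholders, and `Crawler` stands in for your existing crawler:

```csharp
// Crawl.aspx.cs — run the crawler only for requests originating on the server itself.
using System;

public partial class Crawl : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        if (!Request.IsLocal)            // caller is not localhost
        {
            Response.StatusCode = 404;   // pretend the page doesn't exist
            Response.End();
            return;
        }
        new Crawler().Run();             // your crawler class (assumed)
    }
}
```

The localhost check keeps outside visitors from triggering (or even discovering) the crawl endpoint, while the host's scheduler hitting `http://localhost/Crawl.aspx` passes through.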

Upvotes: 2
