Reputation: 3137
I've been looking for a PaaS provider for some time. Nodejitsu seemed promising but doesn't offer some of the features I'm looking for. I need to be able to process a lot of data quickly for many of my requests. I'm off to a good start with Node.js, but what I'd like to do is fire off tasks that scrape web data and compute some statistics (basically a roster) from information stored in a database.
Basically I'm scraping people's social media accounts (Facebook, Twitter, Tumblr, etc.) to determine how much exposure they get on my web service, then serving their latest content (an image and a short text) to the viewers. In the end this creates a very large number of operations per request, because I need to compare statistics across many different artists.
What I imagine doing is something like this:
This is the structure I'd like to deploy on Heroku, so I can use processing dynos to free up web dynos and users are never waiting in the dark for a page to load. Under high traffic some users may have to wait for the page to populate with content, but in most cases the content will start populating soon after the page is rendered. Either way, users who just intend to navigate to another page right away aren't stuck waiting for the site to finish responding.
So basically my question is: how do I leverage worker dynos to free up web dynos in Node? Or is there a better way to do this?
Sorry for any sloppiness, this was typed on my tablet.
Upvotes: 0
Views: 112
Reputation: 34337
Yes, Heroku is great for this sort of thing. See https://devcenter.heroku.com/articles/background-jobs-queueing
The missing component in your thinking is the use of a queue. Resque with coffee-resque is probably the most widely used, but Kue is a great option for an all-Node solution. Both run on top of Redis.
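To make the pattern concrete, here is a minimal sketch of the web/worker split. It uses an in-memory stand-in rather than the real Kue or Resque API, so it runs without Redis. Every name in it (`JobQueue`, `scrape-artist`, `handleRequest`) is hypothetical. With a real queue the jobs would live in Redis, and the web and worker code would run as separate processes, typically declared as separate entries in the Procfile.

```javascript
// In-memory stand-in for a Redis-backed queue such as Kue or Resque.
// All names here (JobQueue, 'scrape-artist', handleRequest) are hypothetical.
class JobQueue {
  constructor() {
    this.pending = [];          // jobs waiting for a worker
    this.handlers = new Map();  // job type -> handler function
    this.results = new Map();   // job id -> { result } or { error }
    this.nextId = 1;
  }

  // Called by the web process: record the job and return immediately.
  enqueue(type, data) {
    const job = { id: this.nextId++, type, data };
    this.pending.push(job);
    return job.id;
  }

  // Called by the worker process: register how to handle a job type.
  process(type, handler) {
    this.handlers.set(type, handler);
  }

  // Worker loop: take jobs in FIFO order and run their handlers.
  drain() {
    while (this.pending.length > 0) {
      const job = this.pending.shift();
      const handler = this.handlers.get(job.type);
      if (!handler) {
        this.results.set(job.id, { error: 'no handler for ' + job.type });
        continue;
      }
      handler(job, (err, result) => {
        this.results.set(job.id, err ? { error: String(err) } : { result });
      });
    }
  }
}

const queue = new JobQueue();

// Worker side: the slow scraping and statistics would happen here.
queue.process('scrape-artist', (job, done) => {
  // Placeholder for real scraping; it just echoes the input.
  done(null, { artist: job.data.artist, latest: 'placeholder content' });
});

// Web side: enqueue and respond right away with a job id the page can poll.
function handleRequest(artist) {
  const jobId = queue.enqueue('scrape-artist', { artist });
  return { status: 202, jobId };
}

const response = handleRequest('some-artist');
queue.drain(); // with a real queue this loop runs in a separate worker dyno
```

The point of the split is that `handleRequest` never does the expensive work itself. It only records a job, so the web dyno can render the page at once, and the page can fetch the result by job id once a worker has finished.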
Upvotes: 1