Reputation: 9441
Im new to Cassandra, so bear with me.
So, I am building a search engine using Cassandra as the db. I am interacting with it through Pycassa.
Now, I want to output Cassandra's response to a webpage, having the user submitted a query.
I am aware of tools such as django, fastCGI, SCGI, etc to allow python to web interaction. However, how does one run a python script on a webserver without turning this server into a single point of failure ( i.e., if this server dies than the system is not accessible by the user) - and therefore negating one purpose of Cassandra?
Upvotes: 1
Views: 472
Reputation: 19377
Theo Schlossnagle's Scalable Internet Architectures covers load balancing and a lot more. Highly recommended.
Upvotes: 0
Reputation: 2866
I've seen this problem before - sometimes people need much more CPU power and bandwidth to generate and serve some server generated HTML and images than they do to run the actual query in Cassandra. For one customer, this was many 10's times the number of servers serving the front end of the website than in their Cassandra cluster.
You'll need to load balance between these front end servers somehow - investigate running haproxy on a few dedicated machines. Its quick and easy to configure, and similarly easy to reconfigure when your setup changes (unlike DNS, which can take days to propagate changes). I think you can also configure nginx to do the same. If you keep per-session information in your front end servers, you'll need each client to go to the same front end server for each request - this is called "session persistence", and can be achieved by hashing the client's IP to pick the front end server. Haproxy will do this for you.
However this approach will again create a SPOF in your configuration (the haproxy server) - you should run more than one, and potentially have a hot standby. Finally you will need to somehow balance load between your haproxies - we typically use round robin DNS for this, as the nodes running haproxy seldom change.
The benefit of this system is that you can easily scale up (and down) the number of front end servers without changing your DNS. You can read (a little bit) more about the setup I'm referring to at: http://www.acunu.com/blogs/andy-ormsby/using-cassandra-acunu-power-britains-got-talent/
Upvotes: 3