ThinkingInBits

Reputation: 11472

What technologies should I be using to create high-performing, scalable web applications?

A little bit about my current situation:

I have a few specific questions I would like answered.

Feel free to add anything else that you feel would help. My main goal is to learn the latest technologies for building high-performing enterprise applications. Additionally, I'm curious how much of a performance increase I would notice by, say, upgrading my Amazon box. And now, for the questions:

  1. How does Facebook return its search results so fast, almost instantly while typing?

  2. How does Facebook achieve its status updates above the chat window? I could easily hack something together that calls a back-end script every 5 seconds or so and updates the UI, but I'm not sure what performance issues I would run into, or whether this is even how Facebook does it.

  3. How are the Facebook status updates aggregated and restricted to friends only and/or feed preferences?

  4. Is MySQL no longer the database of choice for speed and scalability?

  5. What resources and books should I be looking at and reading? I spend each day reading about the stuff I'm already using... but I want to better focus my energy on potentially something more useful.

  6. Generally, what 'stack' of technologies, including languages, servers, and databases, would be used to create something like Facebook? (Mind you, I have no desire to create a social networking site.)

  7. Is there much of a performance hit from using a framework like Symfony2 as opposed to writing a custom-tailored solution? (I know the quality of the code obviously matters, but generally speaking.)

If you don't have an answer to all of these, numbers three, four and five are probably the most important.

Thanks in advance. Happy coding.

Upvotes: 0

Views: 378

Answers (1)

James Youngman

Reputation: 3733

Scalability is all about the location of the data, how it's retrieved and how it's updated. The implementation language is almost irrelevant.

If you have a single source of truth, it immediately becomes the bottleneck. That may not, yet, be so bad. If the bottleneck is 50,000 QPS, you will probably not need to fix it for a while.
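To put a rough number on "for a while", here is a back-of-the-envelope sketch. The current load (500 QPS) and the growth rate (doubling every 6 months) are made-up assumptions for illustration; only the 50,000 QPS ceiling comes from the example above.

```python
import math

def months_until_saturated(current_qps: float, capacity_qps: float,
                           doubling_months: float) -> float:
    """Months until load reaches capacity, assuming load doubles
    every `doubling_months` months (a simplifying assumption)."""
    if current_qps >= capacity_qps:
        return 0.0
    doublings = math.log2(capacity_qps / current_qps)
    return doublings * doubling_months

# Hypothetical numbers: 500 QPS today, doubling every 6 months.
print(round(months_until_saturated(500, 50_000, 6), 1))  # prints 39.9
```

Under those assumed numbers the single source of truth would last a bit over three years, which is the sense in which the bottleneck "may not, yet, be so bad".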

You ask a lot about Facebook and then explain that you don't want to build one. Scaling a system is all about choosing a design that suits the data you are trying to serve. So unless you give us some idea of what you want to build, helping you design for scale is quite hard.

As a trivial but specific example, the data storage designs for Google's web search and Gmail systems are totally, totally different. Both are pretty fast, but their designs are different because the data, its usage pattern, its updates and its characteristics are all very different.

To begin the data design process, start with an idea of what data you need. Then think about

  1. Global consistency - do all users need to see a consistent view of the data? If so, scaling is going to be very hard. (Think about Facebook, Gmail, and Stack Overflow - in these cases, you and I don't need to see an instantaneously consistent view of the data.)

  2. Durability - is it ever acceptable to lose updates? If not, you will need to persist all data (in enough different places that hardware loss is not an issue, remembering that you are not willing to lose updates) before telling the caller that the request is done.

  3. Performance - what are the user's performance needs?

In most systems, you can only design to get two of those three things, and you have to sacrifice the third one to do so.
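As an illustration of the durability point, here is a minimal sketch of "persist in enough places before telling the caller the request is done". The `Replica` class and `durable_write` function are hypothetical names, and local files stand in for what would really be independent machines or disks.

```python
import os
import tempfile

class Replica:
    """Hypothetical stand-in for one independent storage location
    (here just a local file; in practice a separate machine or disk)."""
    def __init__(self, path: str):
        self.path = path

    def persist(self, record: str) -> None:
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(record + "\n")
            f.flush()
            os.fsync(f.fileno())  # force the bytes to disk before returning

def durable_write(record: str, replicas, required_acks: int) -> bool:
    """Return True (i.e. acknowledge the caller) only if at least
    `required_acks` replicas have persisted the record."""
    acks = 0
    for replica in replicas:
        try:
            replica.persist(record)
            acks += 1
        except OSError:
            continue  # a failed replica simply does not count
    return acks >= required_acks

with tempfile.TemporaryDirectory() as tmp:
    replicas = [Replica(os.path.join(tmp, f"replica{i}.log")) for i in range(3)]
    print(durable_write("status=hello", replicas, required_acks=2))  # prints True
```

Note that the caller waits for every write and every `fsync` before getting an answer. That waiting is exactly where durability starts to cost you performance.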

Draw a diagram of your design. Point to each box on it (a box would be a computer, a router, a database instance, a disk, an in-memory data structure, etc., but not a table or a database row). Ask, "how many of these can we have, maximum?" If the answer is "1", then your design is not scalable. If the answer is "as many as you like, but they need to be synchronised", that's going to be your scaling challenge; take a look again at the numbered points above.
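The "point to each box" exercise can also be written down as a checklist. This is only a sketch: the `Box` type, the `review` function and the three example boxes are hypothetical, not a description of any real system.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Box:
    name: str
    max_instances: Optional[int]  # None means "as many as you like"
    needs_sync: bool = False      # must the instances be kept synchronised?

def review(boxes):
    """Answer "how many of these can we have, maximum?" for each box."""
    findings = {}
    for box in boxes:
        if box.max_instances == 1:
            findings[box.name] = "not scalable: only one allowed"
        elif box.needs_sync:
            findings[box.name] = "scaling challenge: instances must be synchronised"
        else:
            findings[box.name] = "ok"
    return findings

# Hypothetical design with three boxes.
design = [
    Box("web server", max_instances=None),
    Box("session cache", max_instances=None, needs_sync=True),
    Box("primary database", max_instances=1),
]
for name, finding in review(design).items():
    print(f"{name}: {finding}")
```

In this made-up design the primary database is the box to worry about first, and the session cache is where the consistency, durability and performance trade-offs come in.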

Upvotes: 2
