0x4a6f4672

Reputation: 28245

How do you partition functionality into separate services?

What is the best way to split a massive web application built with a standard web framework such as Ruby on Rails or Django into small pieces and spread them across a large number of servers? If we partition it into RESTful services following a service-oriented design, we could use one of the methods Paul Dix names in his book "Service-Oriented Design with Ruby and Rails":

Is it preferable to partition on logical function and business logic, on read/write frequencies, or on minimizing joins and database accesses? Another possible choice is to partition by content type: IDs, (social) graphs, maps, files, images, etc. It is common, for example, to store images on Amazon S3 or to get maps from Google Maps. What are the best practices?

Upvotes: 1

Views: 886

Answers (1)

0x4a6f4672

Reputation: 28245

It may be worth taking a look at the internet giants. Amazon and eBay are known for a service-oriented approach and partition everything into services.

eBay: Randy Shoup explains a number of eBay's best practices for building large-scale websites, for example in this presentation about eBay's Architectural Principles and the corresponding article about lessons from eBay. eBay partitions everything. Every problem is split into manageable chunks along multiple dimensions: by data, load, and/or usage pattern. The two basic partitioning patterns are (1) functional segmentation and (2) horizontal split. Both the database tier and the application tier are first segmented by functionality and then split horizontally. Randy says functional segmentation (functional decomposition) is the most important method: related pieces of functionality belong together, while unrelated pieces belong apart. Paul Dix says the same in his book: "Generally, you want to partition services based on their logical function". eBay's architecture has about 200 groups of functionality, aka "apps". The application tier, which runs on 16,000 application servers, is divided into 220 separate application pools (Selling, Searching, Viewing Items, Bidding, Checkout, ...). The database tier has over a thousand different logical databases on 400 hosts, segmented into functional areas. eBay has written its own ORM layer, called the Data Access Layer (DAL), which takes care of the database splits.
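
To make the two patterns concrete, here is a minimal sketch of how a data access layer might route a record: first by functional area, then by a hash of the key. This is not eBay's DAL; the functional areas, shard counts, database naming scheme, and the `shard_for` helper are all hypothetical.

```python
import hashlib

# Hypothetical functional areas and shard counts; not eBay's real layout.
SHARDS_PER_FUNCTION = {"selling": 4, "bidding": 8, "checkout": 2}


def shard_for(function: str, key: str) -> str:
    """Return the name of the logical database that holds a record.

    Step 1 (functional segmentation): choose the pool by functional area.
    Step 2 (horizontal split): choose a shard inside that pool by hashing the key.
    """
    if function not in SHARDS_PER_FUNCTION:
        raise KeyError(f"unknown functional area: {function}")
    shard_count = SHARDS_PER_FUNCTION[function]
    # Use a stable hash so the mapping is identical across processes and
    # restarts (Python's built-in hash() is randomized per process for str).
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    shard_index = int.from_bytes(digest[:8], "big") % shard_count
    return f"{function}_db_{shard_index}"
```

Calling `shard_for("bidding", "user-42")` always returns the same `bidding_db_N` name, so callers never need to know how many shards exist. Note that a plain modulo scheme like this forces data to move when the shard count changes; a lookup table or consistent hashing would avoid that.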

Amazon: At Amazon, everything is divided into services. Service-oriented architecture (SOA) is the fundamental building abstraction for Amazon technologies. Not only is the Amazon.com architecture divided into services; even the developers at Amazon are organized in teams around services. Amazon is really an ecosystem of many internal start-ups, each with its own data and its own API. A service here is something that is operated and owned by a small team of developers. The Amazon.com platform is made of hundreds of services, from primitive, low-level foundation services (Storage, Compute, Queuing, ...) to aggregated, high-level services like Identity Management, Content Generation & Discovery, Product and Offers Management, Order Processing, Payments, or Fulfillment & Customer Service. To construct a product detail page for a customer visiting Amazon.com, the software calls on between 200 and 300 services to present a highly personalized experience for that customer.
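
A page that is assembled from many services is essentially a fan-out and merge. The following is only a rough sketch of that idea, not Amazon's implementation: the three `fetch_*` functions are hypothetical stand-ins for remote calls, and a real system would add timeouts and fallbacks for services that fail.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for remote service calls; real ones would use the network.
def fetch_offers(product_id: str) -> dict:
    return {"offers": [f"offer-for-{product_id}"]}


def fetch_reviews(product_id: str) -> dict:
    return {"reviews": [f"review-for-{product_id}"]}


def fetch_recommendations(product_id: str) -> dict:
    return {"recommendations": [f"similar-to-{product_id}"]}


SERVICES = [fetch_offers, fetch_reviews, fetch_recommendations]


def build_detail_page(product_id: str) -> dict:
    """Call every service in parallel and merge the results into one page model."""
    page = {"product_id": product_id}
    with ThreadPoolExecutor(max_workers=len(SERVICES)) as pool:
        # map() runs the calls concurrently and yields results in SERVICES order.
        for result in pool.map(lambda call: call(product_id), SERVICES):
            page.update(result)
    return page
```

Because each service owns its own data and API, the page builder only depends on the shape of each response, not on how any service stores its data.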

Twitter: Twitter uses services that correspond to the different content types: IDs, graphs, URLs, etc. For ID generation it uses Snowflake, a network service for generating unique ID numbers at high scale. For social graph storage it uses FlockDB, a distributed graph database for storing adjacency lists. As its URL fetcher it uses SpiderDuck, which fetches all URLs shared in Tweets in real time, parses the downloaded content to extract metadata of interest, and makes that metadata available for other Twitter services to consume within seconds.
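
To illustrate why ID generation works well as its own service, here is a minimal sketch of a Snowflake-style generator. It is not Twitter's code. The bit layout (41-bit millisecond timestamp, 10-bit worker ID, 12-bit sequence) and the custom epoch are assumptions based on how the scheme is commonly described, and the `IdGenerator` class is hypothetical.

```python
import threading
import time

# Assumed layout: 41-bit timestamp | 10-bit worker id | 12-bit sequence.
EPOCH_MS = 1_288_834_974_657  # assumed custom epoch, in milliseconds
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class IdGenerator:
    """Generates unique, roughly time-ordered 64-bit IDs without coordination."""

    def __init__(self, worker_id: int):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000)
            if now < self.last_ms:
                # Clock moved backwards: keep using the last timestamp seen.
                now = self.last_ms
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & MAX_SEQUENCE
                if self.sequence == 0:
                    # Sequence exhausted in this millisecond: spin until the next one.
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now
            return (
                ((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self.sequence
            )
```

Each worker only needs a distinct `worker_id`, for example `IdGenerator(worker_id=7).next_id()`, so workers can hand out IDs independently and the results still sort roughly by creation time.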

Upvotes: 2
