TheOne

Reputation: 11159

What is an efficient method for storing media files?

Say you are building the next great social app that will get tons of users. One problem you are likely to run into is hosting a lot of media files in a scalable way without hurting the performance of the rest of the site.

What might be some good ways of going about this problem?

A dedicated server just for the media files?
A cloud?

Upvotes: 2

Views: 644

Answers (1)

Elad

Reputation: 3130

Media files differ from the parts of your system that contain application logic: serving media is an I/O-intensive task, whereas app logic usually needs some combination of I/O and CPU (the exact balance is very app-dependent). That is why it makes sense to use a dedicated media-serving system that is optimized for disk and network throughput.

Some general guidelines if you use your own dedicated server:

  • Invest in lots of RAM and cache your most commonly consumed content in memory. The idea is to avoid disk access time (RAM access is orders of magnitude faster than a disk seek). Memcached is the most popular solution nowadays afaik.
  • Invest in fast disk I/O: install multiple disks and use RAID (striping) to improve throughput.
  • When selecting a hosting provider for your dedicated / co-lo server(s), focus on bandwidth.
  • If possible, locate the files close to their consumers to reduce network latency. For example, media aimed at Brazilian Portuguese speakers would benefit from being stored on a server in South America.
  • A good CDN can solve practically all of the above. In my own experience, it reduced the load on our own servers by ~85%. We use Cotendo and Akamai. Other providers you can look at, off the top of my head: CDNetworks, Limelight, Level3.
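
To make the RAM-caching point concrete, here is a minimal sketch of a least-recently-used cache sitting in front of a slow read. It is only an illustration: `LRUMediaCache`, `max_bytes` and the `load` callback are names I made up for this example, and in practice you would more likely put Memcached or a CDN in front of the disks than roll your own.

```python
from collections import OrderedDict
from typing import Callable


class LRUMediaCache:
    """Keep recently served files in RAM, bounded by a total byte budget.

    Hypothetical helper for illustration; `load` stands in for the slow
    path (for example a disk read) and is supplied by the caller.
    """

    def __init__(self, max_bytes: int, load: Callable[[str], bytes]):
        self.max_bytes = max_bytes
        self.load = load
        self.used_bytes = 0
        self.hits = 0
        self.misses = 0
        self._items: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, key: str) -> bytes:
        if key in self._items:
            # Cache hit: mark the entry as most recently used.
            self._items.move_to_end(key)
            self.hits += 1
            return self._items[key]

        # Cache miss: fall back to the slow path.
        self.misses += 1
        data = self.load(key)

        # Files larger than the whole budget are served but never cached.
        if len(data) <= self.max_bytes:
            self._items[key] = data
            self.used_bytes += len(data)
            # Evict least recently used entries until we fit the budget.
            while self.used_bytes > self.max_bytes:
                _, evicted = self._items.popitem(last=False)
                self.used_bytes -= len(evicted)
        return data
```

Bounding the cache by bytes rather than by entry count matters for media, since a few large videos can otherwise push out thousands of small thumbnails without the limit ever noticing.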

If you're just starting out, your best bet imo is to store your files in S3, with CloudFront as your CDN. In my own experience, it's a very simple solution to set up, and quite cost-effective when starting out, since costs grow linearly with the amount of data and usage. Beyond a certain threshold, though, it makes sense to start looking at managing your own dedicated storage racks and using some other CDN.
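
The "certain threshold" can be estimated with a back-of-the-envelope model: pay-as-you-go cost is purely linear in volume, while running your own storage has a fixed monthly cost plus a lower per-unit cost. The sketch below assumes exactly that simplified model; the function names and every number in the usage example are made-up placeholders, not real S3, CloudFront or hardware prices.

```python
from typing import Optional


def pay_as_you_go_cost(tb: float, price_per_tb: float) -> float:
    """Monthly cost when you pay only for what you store and serve."""
    return tb * price_per_tb


def dedicated_cost(tb: float, fixed_monthly: float, price_per_tb: float) -> float:
    """Monthly cost with a fixed overhead (racks, staff) plus a per-TB cost."""
    return fixed_monthly + tb * price_per_tb


def break_even_tb(
    cloud_price_per_tb: float, fixed_monthly: float, own_price_per_tb: float
) -> Optional[float]:
    """Volume at which both models cost the same, or None if own storage
    never becomes cheaper under this model."""
    if cloud_price_per_tb <= own_price_per_tb:
        return None
    return fixed_monthly / (cloud_price_per_tb - own_price_per_tb)


# Placeholder figures purely for illustration.
threshold = break_even_tb(cloud_price_per_tb=100, fixed_monthly=5000, own_price_per_tb=20)
print(threshold)  # 62.5 (TB per month, under these made-up prices)
```

Below the break-even volume the linear model wins because you pay no fixed overhead; above it, the fixed cost is spread over enough data that dedicated storage comes out ahead.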

Upvotes: 2
