Reputation: 5836
I believe music streaming websites like YouTube must be using a sharded DB (based on some criterion like video category) for load distribution and high availability. But for real-time streaming like this, I think reading from disk or the DB is not going to work. Do they also store all videos in memory (for example in memcached clusters), or are only popular videos kept there?
On top of that, does the CDN also cache all videos? For example, if New York users watched 1 million videos in a day, would those be cached on the New York CDN server (subject to some upper size limit), so that the web server is hit only when a video is not found?
Upvotes: 2
Views: 195
Reputation: 3839
Video files are very heavy content, and they are presumably stored in a file system or on some software layer that works like one. A separate question is how to store video metadata, user info, statistics and so on. As a rule, systems of this kind use several different data stores and databases.
For instance, video files and their thumbnails can be stored in a file system. Meta information like the title, description, keywords, file locations and so on can go in something like a key-value store (or a wide-column database). Such databases are fast and easy to scale. YouTube video links have this format: www.youtube.com/watch?v=<code>
where code is the unique key for the video. For search, statistics, billing and so on, they presumably use other data stores that are better suited to those purposes.
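To make the key-value idea concrete, here is a minimal sketch in Python. It is only an illustration, not how YouTube actually does it: the `METADATA_STORE` dict stands in for a real key-value store, and the key `abc123`, the field names and the file path are all made up.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical in-memory stand-in for a key-value metadata store.
# The key, field names and path are invented for illustration.
METADATA_STORE = {
    "abc123": {
        "title": "Sample video",
        "file_location": "/videos/ab/abc123.mp4",
    },
}


def video_key_from_url(url):
    """Return the value of the `v` query parameter, or None if it is absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("v")
    return values[0] if values else None


def lookup_metadata(url, store=METADATA_STORE):
    """Look up video metadata by the key embedded in a watch URL."""
    key = video_key_from_url(url)
    if key is None:
        return None
    return store.get(key)
```

The point is that a single lookup by key returns everything needed to locate the file, which is the access pattern that key-value stores handle well.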
According to this post, YouTube uses MySQL. But they presumably run it with many restrictions and tuning options in order to achieve good performance, availability and scalability.
A CDN is the standard approach for serving heavily loaded, geographically distributed static traffic. Almost all big cloud providers, such as AWS and Microsoft Azure, offer a CDN as a service, so you only need to configure it. Of course you can also build your own; there are a lot of solutions for this.
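As for whether a CDN caches everything: an edge server has limited space, so it typically keeps recently requested content and goes back to the origin on a miss. Below is a rough sketch of that behaviour, assuming least-recently-used eviction and a capacity counted in items rather than bytes. Real CDNs use their own policies, and `EdgeCache` and `fetch_from_origin` are hypothetical names.

```python
from collections import OrderedDict


class EdgeCache:
    """Toy edge cache: serve from cache on a hit, fetch from origin on a miss.

    Assumes LRU eviction and a capacity measured in number of items.
    """

    def __init__(self, capacity, fetch_from_origin):
        self.capacity = capacity
        self.fetch_from_origin = fetch_from_origin  # callable: video_id -> data
        self._items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, video_id):
        if video_id in self._items:
            # Cache hit: mark the entry as most recently used.
            self._items.move_to_end(video_id)
            self.hits += 1
            return self._items[video_id]

        # Cache miss: only now do we go to the origin server.
        self.misses += 1
        data = self.fetch_from_origin(video_id)
        self._items[video_id] = data
        if len(self._items) > self.capacity:
            # Evict the least recently used entry.
            self._items.popitem(last=False)
        return data
```

With a policy like this, popular videos tend to stay at the edge because they keep getting requested, while rarely watched ones fall out and are fetched from the origin again when needed.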
So, if you want to implement your own video hosting, you need to consider how to store the videos and related info, how to deal with your traffic and so on. A simple solution could be a single relational or document database, plus a file system and a CDN.
Upvotes: 2