Reputation: 70494
I'm working on a system that will need to store a lot of documents (PDFs, Word files, etc.). I'm using Solr/Lucene to search the relevant information extracted from those documents, but I also need a place to store the original files so that users can open or download them.
I was thinking about several possibilities:
The storage I'm looking for should be:
What would you recommend as the best way to store those files?
Upvotes: 3
Views: 2208
Reputation: 26582
You can follow Facebook as it stores a lot of files (15 billion photos):
Here is a Facebook note if you want to learn more: http://www.facebook.com/note.php?note_id=76191543919
Regarding the NFS share: keep in mind that NFS shares usually limit the number of files in one folder for performance reasons. (This can be a bit counterintuitive if you assume that all recent file systems use B-trees to store their structure.) So if you are using a commercial NFS share such as NetApp, you will likely need to spread the files across multiple folders.
You can do that if you have any kind of id for your files. Just split its ASCII representation into groups of a few characters and make a folder for each group. For example, we use integers for ids, so the file with id 1234567891 is stored as storage/0012/3456/7891.
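The folder scheme described above can be sketched roughly as follows. This is only an illustration: the function name `sharded_path` and the defaults (12-digit zero padding, groups of 4, a `storage` root) are assumptions inferred from the single example in this answer, not a description of the actual implementation.

```python
def sharded_path(file_id: int, root: str = "storage",
                 width: int = 12, group: int = 4) -> str:
    """Map an integer id to a nested path such as storage/0012/3456/7891.

    Hypothetical helper: width and group are assumed defaults chosen to
    reproduce the example id from the answer.
    """
    if file_id < 0:
        raise ValueError("file_id must be non-negative")
    digits = str(file_id)
    # Pad to at least `width` digits, rounded up to a multiple of `group`,
    # so every path component has the same length.
    padded_len = -(-max(len(digits), width) // group) * group
    digits = digits.zfill(padded_len)
    parts = [digits[i:i + group] for i in range(0, len(digits), group)]
    return "/".join([root, *parts])


print(sharded_path(1234567891))
```

With these defaults the last group is the file name and each folder holds at most 10,000 entries; a smaller group size lowers that per-folder count at the cost of deeper nesting.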
Hope that helps.
Upvotes: 1
Reputation: 20956
File system: looking at the big picture, a DBMS ends up storing its data on the file system anyway. The file system is dedicated to keeping files, so you get its optimizations directly (as LukeH mentioned).
Upvotes: 0
Reputation: 24545
In my opinion...
I would store the files compressed on disk (in the file system) and use a database to keep track of them, possibly SQLite if that is its only job.
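A minimal sketch of that idea, assuming gzip for compression and a single SQLite table as the index. The table layout and the helper names (`open_index`, `store_document`, `load_document`) are made up for illustration and would need adapting to a real system.

```python
import gzip
import sqlite3
from pathlib import Path


def open_index(db_path: str) -> sqlite3.Connection:
    """Open (or create) the SQLite index; the schema is an assumption."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        " id INTEGER PRIMARY KEY,"
        " original_name TEXT NOT NULL,"
        " stored_path TEXT NOT NULL,"
        " size_bytes INTEGER NOT NULL)"
    )
    return conn


def store_document(conn: sqlite3.Connection, root: str,
                   original_name: str, data: bytes) -> int:
    """Write the gzip-compressed file to disk and record it in the index."""
    cur = conn.execute(
        "INSERT INTO documents (original_name, stored_path, size_bytes)"
        " VALUES (?, '', ?)",
        (original_name, len(data)),
    )
    doc_id = cur.lastrowid
    path = Path(root) / f"{doc_id}.gz"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(gzip.compress(data))
    conn.execute("UPDATE documents SET stored_path = ? WHERE id = ?",
                 (str(path), doc_id))
    conn.commit()
    return doc_id


def load_document(conn: sqlite3.Connection, doc_id: int) -> tuple[str, bytes]:
    """Return (original file name, decompressed content) for a stored id."""
    row = conn.execute(
        "SELECT original_name, stored_path FROM documents WHERE id = ?",
        (doc_id,),
    ).fetchone()
    if row is None:
        raise KeyError(doc_id)
    name, stored_path = row
    return name, gzip.decompress(Path(stored_path).read_bytes())
```

Note that PDFs and .docx files are usually compressed internally already, so the space saved by an extra gzip pass may be small for those formats.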
Upvotes: 0
Reputation: 269658
A filesystem -- as the name suggests -- is designed and optimised to store large numbers of files in an efficient and scalable way.
Upvotes: 5