Slim Shady

Reputation: 1085

Multiple small directories or one huge directory for file storage (PHP, MySQL)

This is a completely theoretical question.

I have a photo storage site where registered users upload photos.

The Question

I have thought of two approaches to storing the uploaded files.

The number of files uploaded to my server is expected to be huge: more than 100 million.

Approach 1

The two directories /pictures/hd/ and /pictures/low/ will directly contain all the files uploaded by all users.

$newfilename = $user_id . time() . basename($filename); // $filename = original name of the uploaded file; basename() strips any path components
$src = '/pictures/hd/' . $newfilename; // for HD pics

Inserting that into MySQL with:

INSERT INTO pics (`user_id`, `src`) VALUES (?, ?) -- bind $user_id and $newfilename through a prepared statement

Approach 2

The two directories /pictures/hd/ and /pictures/low/ will contain one subdirectory per user, and each user's files go into that subdirectory.

This will create a large number of subdirectories, each named after the user_id of the user who uploaded the files.

$dir = '/pictures/hd/' . $user_id . '/';
if (!is_dir($dir)) {
    mkdir($dir, 0755, true); // create the per-user directory (and any missing parents) on first upload
}
$newfilename = $user_id . '/' . $user_id . time() . basename($filename); // $filename = original name of the uploaded file
$src = '/pictures/hd/' . $newfilename; // for HD pics

Inserting that into MySQL with:

INSERT INTO pics (`user_id`, `src`) VALUES (?, ?) -- bind $user_id and $newfilename through a prepared statement

Retrieval

When retrieving an image, I can read the src column of my pics table to get the filename, then load the HD file from '/pictures/hd/'.$src_of_picstable and the low-quality file from '/pictures/low/'.$src_of_picstable.
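The retrieval step can be sketched as a small helper. The function name picture_paths() and the sample src value are hypothetical; only the two base directories come from the question.

```php
<?php
// Hypothetical helper: derive both file locations from the `src` value
// stored in the pics table. The same `src` works for Approach 1
// ("<file>") and Approach 2 ("<user_id>/<file>").
function picture_paths(string $src): array
{
    // Drop a leading slash so the parts join without a double "//".
    $src = ltrim($src, '/');

    return [
        'hd'  => '/pictures/hd/' . $src,
        'low' => '/pictures/low/' . $src,
    ];
}

// Example with a made-up src value in the Approach 2 layout.
$paths = picture_paths('42/421700000000cat.jpg');
echo $paths['hd'], PHP_EOL;
echo $paths['low'], PHP_EOL;
```

Because only the relative part is stored in the database, the base directories can change later without touching the rows.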

Upvotes: 0

Views: 633

Answers (2)

symcbean

Reputation: 48387

The right way to answer the question is to test it.

Which is faster will depend on the number of files and the underlying filesystem. ext3 and ext4 will quite happily cope with very large numbers of files in a single directory (directory entries are managed in an HTree index). Some filesystems just use simple lists. Others have different ways of optimizing file access.
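A rough way to run such a test is sketched below. The function names, the file counts, and the temporary directory are all placeholders chosen for illustration, and the numbers it prints will be dominated by the OS cache unless the file count is raised far beyond this toy size.

```php
<?php
// Create empty test files either in one flat directory or in one
// subdirectory per (simulated) user, and return the list of paths.
function make_files(string $root, int $fileCount, int $userCount, bool $perUser): array
{
    $paths = [];
    for ($i = 0; $i < $fileCount; $i++) {
        $userId = $i % $userCount;
        $dir = $perUser ? "$root/$userId" : $root;
        if (!is_dir($dir)) {
            mkdir($dir, 0755, true);
        }
        $path = "$dir/{$userId}_{$i}.jpg";
        touch($path);
        $paths[] = $path;
    }
    return $paths;
}

// Time one stat-style lookup per file.
function time_lookups(array $paths): float
{
    clearstatcache(); // keep PHP's own stat cache from skewing the result
    $start = microtime(true);
    foreach ($paths as $path) {
        filesize($path);
    }
    return microtime(true) - $start;
}

// Placeholder sizes: 2000 files spread over 50 simulated users.
$base   = sys_get_temp_dir() . '/dirtest_' . uniqid();
$flat   = make_files("$base/flat", 2000, 50, false);
$nested = make_files("$base/nested", 2000, 50, true);

printf("flat:   %.4f s\n", time_lookups($flat));
printf("nested: %.4f s\n", time_lookups($nested));
```

The script leaves its test files under the system temp directory, so they need to be removed by hand afterwards. For a meaningful comparison it should be run on the target filesystem with a file count close to the expected production volume.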

Your first scaling problem will be how to manage the file set across multiple disks. Just extending a single filesystem across lots of disks is a bad idea. If you have lots of directories, you can have lots of mount points, but this doesn't work all that well once you get to terabytes of data.

However, because the content is indexed in the database independently of the file storage, it doesn't matter much which layout you choose now: you can change the mapping of files to locations later without having to move your existing dataset around.
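One way to picture that mapping is a small lookup that turns a database row into a physical path. Everything here is an assumption made for illustration: the mount points, the user-ID ranges, and the function name resolve_path() do not come from the question.

```php
<?php
// Hypothetical table mapping user-ID ranges to storage roots.
// Changing where new files live means editing this table,
// not rewriting the rows in the pics table.
$storageRoots = [
    ['max_user_id' => 499999,      'root' => '/mnt/disk1/pictures'],
    ['max_user_id' => PHP_INT_MAX, 'root' => '/mnt/disk2/pictures'],
];

// Resolve the physical path for a stored `src` value.
// $quality is 'hd' or 'low', matching the question's directories.
function resolve_path(array $roots, int $userId, string $quality, string $src): string
{
    foreach ($roots as $entry) {
        if ($userId <= $entry['max_user_id']) {
            return $entry['root'] . '/' . $quality . '/' . ltrim($src, '/');
        }
    }
    throw new RuntimeException("No storage root configured for user $userId");
}

echo resolve_path($storageRoots, 42, 'hd', '42/421700000000cat.jpg'), PHP_EOL;
```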

Upvotes: 1

insanebits

Reputation: 838

I wouldn't suggest the single-directory approach, for two reasons. Firstly, if you're planning to have a lot of images, your directory will get really big, and finding a single image manually will take a lot longer. You will need to do that when you debug something or test new features.

The second reason for multiple directories is that you can make smaller backups of parts of your gallery. And if you have a really big gallery (let's say several terabytes), a single hard drive might not be enough to contain it all. With multiple directories you can mount each directory on a separate hard drive and in this way handle a gallery of almost unlimited size.

My favorite approach is a YYYY/MM/type-of-image directory structure. This way you can spot when you introduced a bug by looking month by month. You can also make monthly backups without duplicating files that are already backed up. I also make quarterly snapshots of the whole gallery, just in case.

As for type-of-image, there are several versions of each image that I might need, such as the original image, a small thumbnail, a thumbnail, a normal-size image, and so on. This way I can just swap the type-of-image part of the path and get a different image size.

In your case I would suggest a YYYY/MM/type-of-image/user_id structure, where you can easily find all files a user uploaded in a given month in one place.
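A minimal sketch of building such a path follows. The function name dated_dir(), the base directory, and the type names ('original', 'thumb') are made up for illustration; UTC is used so the result does not depend on the server's timezone setting.

```php
<?php
// Build a YYYY/MM/type-of-image/user_id directory path.
// $timestamp is the upload time as a Unix timestamp.
function dated_dir(string $base, int $timestamp, string $type, int $userId): string
{
    return sprintf(
        '%s/%s/%s/%d',
        rtrim($base, '/'),
        gmdate('Y/m', $timestamp), // e.g. "2024/03", always in UTC
        $type,
        $userId
    );
}

// Swapping only the type yields the matching directory for another size.
$uploaded = gmmktime(12, 0, 0, 3, 15, 2024);
echo dated_dir('/pictures', $uploaded, 'original', 42), PHP_EOL;
echo dated_dir('/pictures', $uploaded, 'thumb', 42), PHP_EOL;
```

Storing the upload timestamp (or the resulting relative path) in the pics table keeps the directory recoverable later.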

Upvotes: 0
