Reputation: 13
A little background: the previous project manager was fired for not delivering the project on time. I have little coding experience, but am now leading the team to finish the website.
The website itself is similar to eBay, where an item is added for sale. Images and documents are associated with the item, but hosted in folders that are created when the image is uploaded. The dev team has asked me "how to manage the folders with the documents in relation to the item listing". There will be between 1 and 10 images/documents uploaded per item, and between 1000 and 2000 items listed at any one time (if not more).
From looking around, I believe the easiest solution is to name the folder after the item number and store the reference in MySQL. Each item will have its own item number, so there should be no duplicates. Are there better solutions for the folder management?
Upvotes: 1
Views: 118
Reputation: 522510
What you want to be careful with is that many filesystems limit how many items can be stored in a folder; on Linux the limit is typically around 30000 (ext3, for example, allows roughly 32,000 subdirectories per directory). With the numbers you give there should be little concern there, but you should still plan for the system to be future-proof.
I have found it to be quite useful to store images by their hash. For instance, create a SHA1 hash of the image, e.g.: cce7190663c547d026a6bf8fc8d2f40b3b1b9ea5. Then store the image in a directory structure based on this hash with a few levels of folders:
cce/719/066/3c5/cce7190663c547d026a6bf8fc8d2f40b3b1b9ea5.jpg
This uses the first 12 characters of the hash to form a folder structure 4 levels deep; the file name is the entire hash. Increase or decrease the folder depth as necessary. Each level holds at most 16^3 = 4096 folders, so this allows you to store quite a lot of images ((16^3)^4 * limit, where limit is the per-folder file limit) without hitting the filesystem limits. You then save this path in a database with other information about the image and which items it belongs to. This method also effectively de-duplicates your data storage: identical files produce the same hash, so you'll never store the same image twice.
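As a rough illustration of the layout described above, here is a minimal Python sketch. The function names (path_for_digest, hashed_path), the default ".jpg" extension and the levels/width parameters are all made up for this example; your site may well be written in another language, and the same idea carries over directly.

```python
import hashlib


def path_for_digest(digest: str, ext: str = ".jpg",
                    levels: int = 4, width: int = 3) -> str:
    """Build the relative storage path for an already computed hex digest.

    The first levels * width characters become nested folder names and
    the full digest is kept as the file name. All names and defaults here
    are illustrative assumptions, not part of any library.
    """
    if len(digest) < levels * width:
        raise ValueError("digest is too short for the requested depth")
    folders = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return "/".join(folders + [digest + ext])


def hashed_path(data: bytes, ext: str = ".jpg") -> str:
    """Hash the raw file contents with SHA1 and return the storage path."""
    return path_for_digest(hashlib.sha1(data).hexdigest(), ext)


# The digest from the example above maps to the path shown above.
example = path_for_digest("cce7190663c547d026a6bf8fc8d2f40b3b1b9ea5")
```

The returned string is what you would save in the database row for the image, alongside the item it belongs to. Because the path depends only on the file contents, uploading the same file twice yields the same path.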
Upvotes: 1
Reputation: 598
As mentioned, images could be renamed to productid-docid-imageid-timestamp. If the images are not retrieved very often, storing them in the database as BLOBs and serving each one under a different name may help.
Upvotes: 2
Reputation: 16362
It used to be that filesystem performance would deteriorate if there were too many files in a directory, so the common wisdom was to limit to ~1,000 items in any directory.
Try creating a directory structure around the item_id (padded), so #1002003 might be 001002003, which could be found in 001/002/001002003.jpg.
Since you're storing more than one image per item, you might have one more level, e.g. 001/002/003/001002003_1.jpg.
Use the full ID in the file's name in the final directory (001002003.jpg, not 003.jpg). It'll come in handy later, for example if a file ever gets moved or copied out of its folder.
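A small Python sketch of the padded-ID layout above, in case it helps the dev team. The function name item_image_path, the 9-digit padding and the "_1" style image suffix are assumptions taken from the examples in this answer, not a fixed convention.

```python
def item_image_path(item_id: int, image_no: int, ext: str = ".jpg") -> str:
    """Return the relative path for one image of an item.

    The ID is zero-padded to 9 digits and split into three 3-digit
    folder levels; the file name keeps the full padded ID plus the
    image number. Names and padding width are illustrative assumptions.
    """
    if item_id < 0 or item_id > 999_999_999:
        raise ValueError("item_id must fit in 9 digits")
    padded = str(item_id).zfill(9)
    folders = [padded[0:3], padded[3:6], padded[6:9]]
    return "/".join(folders + [f"{padded}_{image_no}{ext}"])


# Item #1002003, first image
path = item_image_path(1002003, 1)
```

With three digits per level, no folder ever holds more than 1,000 subfolders, which matches the rule of thumb mentioned at the top of this answer.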
Hope that helps.
Upvotes: 0