Reputation: 4037
I have a servlet with an API that delivers images from GET requests. The servlet creates a data file of CAD commands based on the parameters of the GET request. This data file is then delivered to an image parser, which creates an image on the file system. The servlet reads the image and returns the bytes in the response.
All of the I/O and the invocation of the image parser are very taxing: images of around 80 KB take 3,000-4,000 ms to render on a local system.
There are roughly 20 parameters that make up the GET request. Each correlates to a different portion of the image, so the number of possible images is extremely large.
To alleviate the loading time, I plan to store BLOBs of rendered images in a database. If a GET request matches one previously executed, I will pull from the cache; otherwise, I will render a new one. This does not fix the "first-time" run, but it will help every run after that.
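A minimal sketch of that lookup, assuming hypothetical class and method names; the parameter names are sorted so equivalent requests hash to the same key, and an in-memory `ConcurrentHashMap` stands in for the BLOB table:

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

public class ImageBlobCache {
    // Stand-in for the BLOB table; a real setup would query the
    // database by this key instead.
    private final Map<String, byte[]> blobs = new ConcurrentHashMap<>();

    // Sort parameter names so equivalent requests with a different
    // parameter order map to the same cache key.
    static String cacheKey(Map<String, String> params) {
        StringBuilder key = new StringBuilder();
        new TreeMap<>(params).forEach((k, v) ->
            key.append(k).append('=').append(v).append(';'));
        return key.toString();
    }

    // Return the cached image, rendering it only on the first request.
    byte[] getOrRender(Map<String, String> params) {
        return blobs.computeIfAbsent(cacheKey(params), k -> render(params));
    }

    // Placeholder for the actual CAD render pipeline.
    byte[] render(Map<String, String> params) { return new byte[]{1}; }
}
```

The first request with a given parameter set pays the render cost; every later request with the same parameters, in any order, is a map lookup.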
Any other ideas on how I can improve performance?
Upvotes: 3
Views: 892
Reputation: 62652
You can turn all of the parameters that you feed into the rendering pipeline into a single String in a predictable way, then compute a SHA1 hash of that input and store the output file in a directory with the SHA1 as the file name. When a request arrives with the same parameters, you just compute the hash and check whether the file is on disk: if it is, return it; otherwise, send the work to the render pipeline and create the file.
If you have a lot of files, you might want to use more than one directory; maybe look at how git divides files across directories by the first few chars of the SHA1 for inspiration.
I use a similar setup in my app. I am not doing rendering, just storing files: the files are stored in the db, but for performance reasons I serve them from disk, using the SHA1 hash of the file contents as the filename / URI for the file.
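A sketch of this scheme in Java (class and method names are hypothetical): `MessageDigest` computes the SHA-1 of the canonicalized parameters, and the first two hex characters pick the subdirectory, git-style:

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.Map;
import java.util.TreeMap;

public class RenderCache {
    private final Path cacheDir;

    public RenderCache(Path cacheDir) { this.cacheDir = cacheDir; }

    // Build a canonical string by sorting parameter names so the same
    // request always hashes to the same value regardless of order.
    static String canonicalize(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        new TreeMap<>(params).forEach((k, v) ->
            sb.append(k).append('=').append(v).append('&'));
        return sb.toString();
    }

    static String sha1Hex(String input) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-1")
            .digest(input.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) hex.append(String.format("%02x", b));
        return hex.toString();
    }

    // Git-style fan-out: the first two hex chars pick a subdirectory.
    Path pathFor(String sha1) {
        return cacheDir.resolve(sha1.substring(0, 2))
                       .resolve(sha1.substring(2));
    }

    byte[] getOrRender(Map<String, String> params) throws Exception {
        Path file = pathFor(sha1Hex(canonicalize(params)));
        if (Files.exists(file)) {
            return Files.readAllBytes(file);   // cache hit
        }
        byte[] image = render(params);         // cache miss: render once
        Files.createDirectories(file.getParent());
        Files.write(file, image);
        return image;
    }

    // Placeholder for the actual CAD render pipeline.
    byte[] render(Map<String, String> params) { return new byte[0]; }
}
```

Because the hash is derived only from the canonicalized parameters, two requests that differ only in parameter order resolve to the same file on disk.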
Upvotes: 1
Reputation: 3191
You can store the file on disk and only the image path in the database, because database storage is usually more expensive than file system storage.
Sort the HTTP GET parameters and hash them as an index to the image record, for fast lookup by parameters.
To make sure your program does not crash when disk capacity runs out, you should remove unused or rarely used records:
store a lastAccessedTime for each record, updated each time the image is requested.
Use a scheduler to check lastAccessedTime and remove records whose weight falls below a specified threshold. You can use different strategies to calculate the weight, such as lastAccessedTime, accessedCount, image size, etc.
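A sketch of such a scheduler in Java (all names hypothetical), using the file's last-modified time as a stand-in for lastAccessedTime; a real implementation might combine accessedCount and image size into the weight as described:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class CacheEvictor {
    // Delete cached images that have not been touched for longer than
    // maxIdle, so the disk cannot fill up indefinitely.
    static void evict(Path cacheDir, Duration maxIdle) throws IOException {
        Instant cutoff = Instant.now().minus(maxIdle);
        try (var files = Files.walk(cacheDir)) {
            files.filter(Files::isRegularFile)
                 .filter(p -> {
                     try {
                         return Files.getLastModifiedTime(p)
                                     .toInstant().isBefore(cutoff);
                     } catch (IOException e) {
                         return false; // skip files we cannot stat
                     }
                 })
                 .forEach(p -> {
                     try { Files.delete(p); }
                     catch (IOException ignored) { /* best effort */ }
                 });
        }
    }

    // Run the sweep periodically on a background thread; the caller
    // keeps the returned executor alive for the application's lifetime.
    static ScheduledExecutorService startSweep(Path cacheDir, Duration maxIdle) {
        ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try { evict(cacheDir, maxIdle); }
            catch (IOException e) { e.printStackTrace(); }
        }, 1, 1, TimeUnit.HOURS);
        return scheduler;
    }
}
```

If the image paths live in the database, the sweep should delete the database record in the same pass as the file, so lookups never point at a file that has already been evicted.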
Upvotes: 2