Nicolai

Reputation: 2915

Best way to create disc cache for web service

I have created a web service that delivers images. Communication is always one-way: the side that gets the images from the service never changes them.

It has multiple sources, some of which can be far away and on bad connections. I have created a memory cache for it, but I would also like to have a disc cache, to store images for longer periods. I am a bit unsure of the best approach.

First of all, all of my sources are web servers, so I don't really know how to check, for example, the last-modified date of my images, which I would like to use to see whether a file has changed.

Second, how do I best store my local cache? Should I just drop the files in a folder and compare dates with the original source? Or store the timestamps for all the images in a text file, to avoid checking the files themselves? Or maybe store them in a local SQL Express DB?

The images are generally not very large. Most are around 200 KB, but every now and then there will be one of 7+ MB. The big problem is that some of the locations where the service will be hosted are on really bad connections, and they will need to use the same image many times. There are no real performance requirements; I just want to make it as responsive as possible for the locations that have a horrible connection to our central servers.

I can't install any "real" cache systems. It has to be something I can handle in my code.

Upvotes: 2

Views: 375

Answers (1)

zmbq

Reputation: 39013

Why don't you install a proxy server on your server and access all the remote web servers through that? The proxy server will take care of caching for you.

EDIT: Since you can't install anything and don't have a database available, I'm afraid you're stuck with implementing the disk cache yourself.

The good news is that it's relatively easy. Pick a folder and place your image files there. You also need a unique mapping between each image's identifier and a file name. If your image IDs are numbers, the mapping is very simple...
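To make the mapping concrete, here is a minimal sketch in Python (the question doesn't name a language, so that choice is an assumption). The folder name `image_cache`, the `.img` extension and the `cache_path` helper are all hypothetical. Numeric IDs become the file name directly; any other identifier, such as a source URL, is hashed so the name is always safe for the file system.

```python
import hashlib
from pathlib import Path

# Hypothetical cache folder; use whatever location suits your service.
CACHE_DIR = Path("image_cache")

def cache_path(image_id, cache_dir=CACHE_DIR):
    """Map an image identifier to a unique file path inside the cache folder."""
    if isinstance(image_id, int):
        # Numeric IDs map straight to a file name.
        name = f"{image_id}.img"
    else:
        # Other identifiers (e.g. URLs) may contain characters that are not
        # valid in file names, so use a SHA-256 digest of the text instead.
        digest = hashlib.sha256(str(image_id).encode("utf-8")).hexdigest()
        name = f"{digest}.img"
    return Path(cache_dir) / name
```

The same identifier always yields the same path, which is all the cache needs in order to find a file again later.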

When you receive a request for an image, first check for it on the disk. If it's there, you have it already. If not, download it from the remote server and store it there, then serve it from there.
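The lookup described above could be sketched roughly like this in Python (an assumed language, since the question doesn't specify one). `get_image` is a hypothetical name, and `fetch` stands in for whatever code you already have that downloads an image from the remote source. This version ignores concurrency entirely.

```python
from pathlib import Path

def get_image(image_id, cache_dir, fetch):
    """Return the image bytes, downloading only on a cache miss.

    `fetch` is a placeholder for your own download function: it takes the
    image ID and returns the image as bytes.
    """
    path = Path(cache_dir) / f"{image_id}.img"
    if path.exists():
        # Cache hit: serve straight from disk.
        return path.read_bytes()
    # Cache miss: download, store on disk, then serve the stored copy.
    data = fetch(image_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return path.read_bytes()
```

With this shape, a slow remote source is only contacted the first time an image is requested; every later request is a local file read.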

You'll need to take concurrent requests into account. Make sure writing the files to disk is a relatively brief process (you can write them once you finish downloading them). While you are writing a file to disk, make sure nobody can open it for reading; that way you avoid sending incomplete files.
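One common way to get this effect, and note that it is a different technique from locking the file against readers, is to write to a temporary file in the same folder and rename it into place once it is complete. A minimal Python sketch, with `write_atomically` being a hypothetical helper name; the rename is atomic on POSIX systems, and I'm assuming your platform behaves similarly:

```python
import os
import tempfile

def write_atomically(path, data):
    """Write `data` to `path` so readers never see a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    # The temp file lives in the same folder so the final rename stays on
    # one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # Swap the finished file into place under its final name.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Because the final name only ever points at a complete file, a reader that finds the file in the cache can serve it without further checks.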

Now you just need to handle the case where the file isn't in your cache, and two requests for it are received at once. If performance isn't a real issue, just download it twice.
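If you would rather not download twice, one option is a lock per image ID, so the second request waits for the first and then finds the image already cached. This is only a sketch in Python (an assumed language), it only works within a single process, and a plain dict stands in for the disk cache to keep it short. `lock_for` and `get_once` are hypothetical names.

```python
import threading

_locks_guard = threading.Lock()  # protects the dictionary of per-image locks
_locks = {}

def lock_for(image_id):
    """Return the lock that belongs to this image ID, creating it if needed."""
    with _locks_guard:
        return _locks.setdefault(image_id, threading.Lock())

def get_once(image_id, cache, fetch):
    """Fetch each image at most once, even under concurrent requests.

    `cache` is a dict standing in for the disk cache, and `fetch` is a
    placeholder for your own download function.
    """
    with lock_for(image_id):
        if image_id not in cache:
            cache[image_id] = fetch(image_id)
        return cache[image_id]
```

Requests for different images use different locks, so they don't block each other; only requests for the same missing image are serialized.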

Upvotes: 2
