Reputation: 13
I have a web server that is dynamically creating various reports in several formats (pdf and doc files). The files require a fair amount of CPU to generate, and it is fairly common to have situations where two people are creating the same report with the same input.
Inputs:
When a user attempts to generate a report, I would like to check to see if a file already exists with the given input, and if so return a link to the file. If the file doesn't already exist, then I would like to generate it as needed.
What solutions are already out there? I've cached simple http requests before, but the keys were extremely simple (usually database id's)
If I have to do this myself, what is the best way. The input can be several hundred words, and I was wondering how I should go about transforming the strings into keys sent to the cache.
//entire input, uses too much memory, one to one mapping cache['one two three four five six seven eight nine ten eleven...'] //short keys cache['one two'] => 5 results, then I must narrow these down even more
Is this something that should be done in a database, or is it better done within the web app code (python in my case)
Thanks you everyone.
Upvotes: 1
Views: 158
Reputation: 391952
This is what Apache is for.
Create a directory that will have the reports.
Configure Apache to serve files from that directory.
If the report exists, redirect to a URL that Apache will serve.
Otherwise, the report doesn't exist, so create it. Then redirect to a URL that Apache will serve.
There's no "hashing". You have a key ("a string (equations, numbers, and lists of words), arbitrary length, almost 99% will be less than about 200 words") and a value, which is a file. Don't waste time on a hash. You just have a long key.
You can compress this key somewhat by making a "slug" out of it: remove punctuation, replace spaces with _, that kind of thing.
You should create an internal surrogate key which is a simple integer.
You're simply translating a long key to a "report" which either exists as a file or will be created as a file.
Upvotes: 2
Reputation: 304375
The usual thing is to use a reverse proxy like Squid or Varnish
Upvotes: 1