Raffaele

Reputation: 511

Http Proxy servlet with caching capabilities

I'm writing a very simple proxy servlet, using this one as a reference. I want to add caching capabilities, where the cache key would be the URI.

Of course, the issue is that I cannot cache the whole response: if I pass it through the pipeline, the input stream is consumed and the cached copy is no longer available.

What do you think is the best way to approach this? How can I copy an HttpResponse (or just the HttpEntity) without consuming its content?

Upvotes: 0

Views: 215

Answers (1)

Emre Isik

Reputation: 746

An InputStream, unless otherwise stated, is single shot: you consume it once and that's it.

If you want to read it many times, it isn't just a stream any more; it's a stream with a buffer. To cache the input stream, write the response content to a file or keep it in memory, so you can re-read it as often as you need.
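As a rough illustration of the "stream with a buffer" idea, here is a minimal sketch using only the JDK (Java 9+ for readAllBytes). The class name BufferedBody and its methods are made up for this example and are not part of HttpClient; it also assumes the body is small enough to fit in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not a library class): buffers a one-shot stream
// so the same bytes can be handed out as fresh streams many times.
final class BufferedBody {
    private final byte[] bytes;

    private BufferedBody(byte[] bytes) {
        this.bytes = bytes;
    }

    // Reads the source stream to the end, closes it, and keeps the bytes in memory.
    static BufferedBody drain(InputStream source) throws IOException {
        try (InputStream in = source) {
            return new BufferedBody(in.readAllBytes());
        }
    }

    // Each call returns a new, independent stream over the buffered bytes.
    InputStream newStream() {
        return new ByteArrayInputStream(bytes);
    }

    // Number of buffered bytes.
    int length() {
        return bytes.length;
    }
}
```

The original stream is still consumed exactly once, in drain(); every later reader gets its own stream from newStream().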

An HttpEntity can be re-readable, but that depends on the implementation. You can check this with isRepeatable(). The Apache Javadoc describes the kinds of entities as follows:

streamed: The content is received from a stream, or generated on the fly. In particular, this category includes entities being received from a connection. Streamed entities are generally not repeatable.
self-contained: The content is in memory or obtained by means that are independent from a connection or other entity. Self-contained entities are generally repeatable.
wrapping: The content is obtained from another entity.

You could use the FileEntity which is self-contained and therefore repeatable (re-readable).

To achieve this (caching into a file), read the content of the HttpEntity and write it into a File. Then create a FileEntity from that File. Finally, replace the HttpResponse's entity with the new FileEntity.

Here is a simple example without context:

import java.io.File;
import org.apache.commons.io.FileUtils;
import org.apache.http.HttpEntity;
import org.apache.http.entity.ContentType;
import org.apache.http.entity.FileEntity;

// Get the untouched entity from the HttpResponse.
HttpEntity originalEntity = response.getEntity();

// Obtain the content type of the response (null if the entity declares none).
ContentType contentType = ContentType.get(originalEntity);

// Create a file for the cache. You should hash the URL and use the hash as the filename.
File targetFile = new File("/some/cache/folder/{--- HERE the URL in HASHED form ---}");

// Copy the input stream into the file above. This consumes and closes the original stream.
FileUtils.copyInputStreamToFile(originalEntity.getContent(), targetFile);

// Create a new entity backed by the file and replace the HttpResponse's entity with it.
HttpEntity newEntity = new FileEntity(targetFile, contentType);
response.setEntity(newEntity);

Now you can re-read the content from the file again and again in the future.
You just need to find the file based on the URI :)
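One possible way to map a URI to a cache file name is to hash it, for example with SHA-256. The sketch below uses only the JDK; the class CacheFiles and its method names are invented for this example, and the choice of SHA-256 is an assumption, any stable hash with a low collision risk would do.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not a library class): derives cache file names from URIs.
final class CacheFiles {
    private CacheFiles() {}

    // Returns the lowercase hex SHA-256 digest of the URI (64 characters),
    // which is safe to use as a file name on common file systems.
    static String fileNameFor(String uri) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(uri.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Resolves the cache file for a URI inside the given cache directory.
    static Path cacheFileFor(Path cacheDir, String uri) {
        return cacheDir.resolve(fileNameFor(uri));
    }
}
```

On a later request, the proxy can compute cacheFileFor(cacheDir, uri) and check whether that file exists before contacting the upstream server.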

To cache in memory, you could use a ByteArrayEntity instead.

Note that this method caches only the body, not the HTTP headers.
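If you also need the status code and headers, one option is to store them next to the body in your own cache entry. The following is only a sketch with JDK classes (Java 10+ for List.copyOf); CachedResponse and ResponseCache are hypothetical names, the cache is an unbounded in-memory map keyed by URI, and it does no expiry or validation.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical cache entry: status line code, headers, and body together.
final class CachedResponse {
    final int status;
    final Map<String, List<String>> headers;
    final byte[] body;

    CachedResponse(int status, Map<String, List<String>> headers, byte[] body) {
        this.status = status;
        // Defensive copies, so later changes by the caller do not alter the entry.
        Map<String, List<String>> copy = new LinkedHashMap<>();
        headers.forEach((name, values) -> copy.put(name, List.copyOf(values)));
        this.headers = Collections.unmodifiableMap(copy);
        this.body = body.clone();
    }
}

// Hypothetical in-memory cache keyed by request URI (unbounded, no expiry).
final class ResponseCache {
    private final Map<String, CachedResponse> entries = new ConcurrentHashMap<>();

    void put(String uri, CachedResponse response) {
        entries.put(uri, response);
    }

    // Returns the cached entry, or null on a cache miss.
    CachedResponse get(String uri) {
        return entries.get(uri);
    }
}
```

On a hit, the servlet would copy the stored status and headers onto its own response and then write the body bytes.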

Update: Alternative

Or you could use Apache HttpClient Cache.
https://hc.apache.org/httpcomponents-client-ga/tutorial/html/caching.html

Upvotes: 1
