Reputation: 12680
I intend to write structured data to a resource accessed via an HTTP client. APIs for doing this (for JSON, YAML, XML) tend to make me pass them an OutputStream which they will write to - they don't give me an InputStream.
For better or worse, the Apache HTTP Components HttpClient is the client being used here. (Other libraries we use depend on it. It's not entirely bad for the most part and at least doesn't force us to employ weird thread-local hacks just to get sane behaviour, unlike java.net.URL.)
When making a request, HttpEntityEnclosingRequestBase (in HttpClient) forces me to set an HttpEntity to get any data to the server. HttpEntity seemingly forces me to implement getContent(), returning an InputStream.
I don't have an InputStream, so I am forced to choose between two workarounds:
A) Serialise all the data into an in-memory byte array and then stream it all back out again. I don't want to do this, because usually, the serialised form of the data takes up a lot more memory than the data itself, and in some cases we don't even have it in memory in the first place, so this would be asking for trouble.
B) Create a Pipe. Spin up a second thread to write the object to the OutputStream end of the pipe. Return the InputStream end. This can't actually be done in HttpEntity itself, because HttpEntity has no idea when the data stream is no longer needed. (It could make an educated guess that it's done when you reach the end of the stream, but if the connection to the server dropped halfway, you'd leave the pipe open forever.) This means I end up moving the workaround to every place where a connection is made, which is a lot of structural duplication.
Neither of these workarounds is great, but I guess (B) is "less shit" because it at least won't crash the entire application when a large object is transferred.
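For reference, workaround (B) can be sketched with the JDK's own PipedInputStream and PipedOutputStream. This is only a minimal sketch under some assumptions: the helper name openPipe and the class PipeSketch are hypothetical, WriteLogic is the callback interface shown further down, and the caller is assumed to be responsible for closing the returned stream (which is exactly the duplication complained about above).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;

// Callback interface from the question.
interface WriteLogic {
    void withOutputStream(OutputStream stream) throws IOException;
}

final class PipeSketch {
    // Hypothetical helper: runs the write logic on a second thread and
    // returns the read end of the pipe. The caller must close the result.
    static InputStream openPipe(WriteLogic writeLogic) throws IOException {
        PipedInputStream in = new PipedInputStream();
        PipedOutputStream out = new PipedOutputStream(in);
        Thread writer = new Thread(() -> {
            // Closing the write end is what signals end-of-stream to the reader.
            try (OutputStream o = out) {
                writeLogic.withOutputStream(o);
            } catch (IOException e) {
                // Either the reader closed the pipe early or the write logic
                // failed. This sketch just stops; the reader then sees a
                // truncated stream, so real code would need to report this.
            }
        }, "pipe-writer");
        writer.setDaemon(true);
        writer.start();
        return in;
    }

    public static void main(String[] args) throws IOException {
        WriteLogic logic = s -> s.write("{\"a\":1}".getBytes(StandardCharsets.UTF_8));
        try (InputStream in = openPipe(logic)) {
            System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```

If the reader closes its end early, the writer's next write fails with an IOException and the thread exits, so nothing is left hanging as long as every caller remembers to close the stream.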
Here's as far as I got:
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;

    import org.apache.http.entity.AbstractHttpEntity;
    import org.apache.http.entity.ContentType;

    public class WriteLogicEntity extends AbstractHttpEntity {
        private final WriteLogic writeLogic;

        public WriteLogicEntity(WriteLogic writeLogic) {
            this(writeLogic, null);
        }

        public WriteLogicEntity(WriteLogic writeLogic,
                                ContentType contentType) {
            this.writeLogic = writeLogic;
            if (contentType != null) {
                this.setContentType(contentType.toString());
            }
        }

        @Override
        public boolean isRepeatable() {
            // We could enforce that all WriteLogic be repeatable
            // or add a similar method there, but at least for now,
            // assuming it isn't repeatable is safe.
            return false;
        }

        @Override
        public long getContentLength() {
            // We really don't know.
            return -1L;
        }

        @Override
        public InputStream getContent() throws IOException {
            //TODO: What do we do here?
        }

        @Override
        public void writeTo(OutputStream outstream) throws IOException {
            // Hand the connection's stream straight to the serialisation logic.
            writeLogic.withOutputStream(outstream);
        }

        @Override
        public boolean isStreaming() {
            return true; //TODO: Verify this choice
        }
    }

    public interface WriteLogic {
        void withOutputStream(OutputStream stream) throws IOException;
    }
Now I'm wondering if getContent() can just throw UnsupportedOperationException. Surely when making a request, they would be using writeTo() anyway, right? Well, I can't figure it out. Even if it works in one experiment, that wouldn't assure me that it is impossible for some kind of request to demand a call to getContent().
So I'm wondering if anyone who knows this library better than me can make a call on it - is it safe to skip implementing this method?
(This getContent() method just seems like it shouldn't be in the API. Or it should be documented to at least allow me some way out of implementing it. I intend to file a bug about it anyway, because it's extremely inconvenient to be forced to provide an InputStream when you are trying to write a request.)
Upvotes: 1
Views: 578
Reputation: 27538
If the entity content cannot be represented as an InputStream, the getContent method can throw UnsupportedOperationException. Internally, HttpClient uses writeTo to stream entity content out to the underlying HTTP connection.
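As a rough illustration of that split, here is a dependency-free sketch. WriteOnlyBody and WriteOnlyBodyDemo are hypothetical stand-ins, not Apache classes: they only mirror the two methods discussed so the example runs without HttpClient on the classpath, and a ByteArrayOutputStream plays the role of the connection's output stream. In the real entity the same two method bodies would go into the AbstractHttpEntity subclass from the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Callback interface from the question.
interface WriteLogic {
    void withOutputStream(OutputStream stream) throws IOException;
}

// Hypothetical stand-in for the entity; it is NOT the Apache type.
final class WriteOnlyBody {
    private final WriteLogic writeLogic;

    WriteOnlyBody(WriteLogic writeLogic) {
        this.writeLogic = writeLogic;
    }

    // The content cannot be offered as an InputStream, so refuse loudly.
    InputStream getContent() {
        throw new UnsupportedOperationException(
                "content is only available through writeTo()");
    }

    // The only supported path: push the data into the given stream.
    void writeTo(OutputStream outstream) throws IOException {
        writeLogic.withOutputStream(outstream);
    }
}

final class WriteOnlyBodyDemo {
    public static void main(String[] args) throws IOException {
        WriteOnlyBody body = new WriteOnlyBody(
                out -> out.write("<doc/>".getBytes(StandardCharsets.UTF_8)));

        // Stands in for the stream of the underlying HTTP connection.
        ByteArrayOutputStream connection = new ByteArrayOutputStream();
        body.writeTo(connection);
        System.out.println(new String(connection.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

Throwing from getContent() makes any unexpected caller fail fast instead of silently receiving an empty or partial stream.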
Upvotes: 1