Hanno Fietz

Reputation: 31380

Which type do I use to represent an arbitrary blob in Java?

I have an application that may receive data via various methods and in various formats. I have pluggable receivers that somehow acquire the data (e.g. by polling a mailbox, listening for HTTP requests, watching the contents of a directory etc.), associate it with a MIME type and then pass it on wrapped like this:

public class Transmission {
    private String origin;      // where the data came from
    private String destination; // where the data was sent to
    private String mime;        // the MIME type of the data
    private BLOB data;          // this is what I need an appropriate type for
}

Further down the line, the data is processed by specialized handlers according to the value of the mime field. I'm expecting things like ZIP files, Excel documents, SOAP, generic XML, plain text and more. At this point, the code should be agnostic as to what's in the data. What is an appropriate type for the data field? Object? InputStream? byte[]?

Upvotes: 3

Views: 3434

Answers (4)

R. Martinho Fernandes

Reputation: 234514

I would go with either byte[] or InputStream, preferring the stream since it is more flexible. You can use a ByteArrayInputStream to feed it an array of bytes, if need be. Going the other way means reading the entire stream into memory first.

There is also the benefit of memory efficiency, since the stream can handle large chunks of external data without much memory. If you use byte[], you need to load all the data into memory. In other words, the stream is lazy.
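
As a rough sketch of that point (the class and method names here are made up for illustration, not taken from the question), a consumer written against InputStream can work through data of any size with a small fixed buffer, and a byte[] can still be fed to it through ByteArrayInputStream:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper class, for illustration only.
class StreamSketch {

    // Adapts an in-memory array to the stream-based API.
    static InputStream wrap(byte[] data) {
        return new ByteArrayInputStream(data);
    }

    // Counts the bytes in a stream using a fixed 4 KiB buffer, so memory use
    // stays constant regardless of how much data the source delivers.
    static long countBytes(InputStream in) {
        byte[] buffer = new byte[4096];
        long total = 0;
        try {
            int read;
            while ((read = in.read(buffer)) != -1) {
                total += read;
            }
        } catch (IOException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

The same countBytes call would work unchanged on a FileInputStream or a socket stream, which is where the laziness pays off.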

Upvotes: 5

Brian Agnew

Reputation: 272317

In your class above, I would make it a byte[]. Why not a java.sql.Blob? So that your Transmission object stays SQL (or datastore) agnostic.

E.g. you may at some stage want to write it to a JavaSpace, CouchDB or something else that isn't a SQL database. Stored as a byte array, the data is in its most basic form, and you can translate it as you wish. If your byte[] is really sizable, then your Transmission object can handle caching via disk etc. But I would worry about that later.

EDIT: The reference to SQL was made because an old answer (now deleted) recommended java.sql.Blob. Unfortunately, with that answer gone, the reference here reads as somewhat out of place.
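
A minimal sketch of the byte[] variant, assuming the Transmission class from the question is made immutable. The constructor, the accessors and openStream() are my additions for illustration, not part of the original class:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.Arrays;

// Variation on the question's Transmission class; everything beyond the
// four fields is an assumption made for this sketch.
final class Transmission {
    private final String origin;      // where the data came from
    private final String destination; // where the data was sent to
    private final String mime;        // the MIME type of the data
    private final byte[] data;        // the raw payload, storage-agnostic

    Transmission(String origin, String destination, String mime, byte[] data) {
        this.origin = origin;
        this.destination = destination;
        this.mime = mime;
        // Defensive copy, so later changes to the caller's array do not leak in.
        this.data = Arrays.copyOf(data, data.length);
    }

    String getOrigin() { return origin; }
    String getDestination() { return destination; }
    String getMime() { return mime; }

    // Returns a copy so handlers cannot modify the stored payload.
    byte[] getData() { return Arrays.copyOf(data, data.length); }

    // Translates the basic form into a stream for handlers that prefer one.
    InputStream openStream() { return new ByteArrayInputStream(data); }
}
```

The copies cost memory for large payloads, so whether they are worth it depends on how much you trust the receivers and handlers.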

Upvotes: 0

skaffman

Reputation: 403501

Personally, I would use Spring's Resource abstraction. This provides a nicer wrapper around the idea of a resource that exists somewhere. It provides methods to retrieve an InputStream for when you want to consume the resource.

The easiest implementation for you might be ByteArrayResource, which encapsulates a byte[]. If that gets too big, then later you can switch to something like a FileSystemResource, or a UrlResource, or one of the various other implementations provided by Spring. But since you always talk to the Resource interface, your client code shouldn't change too much.

Also, since this is just a set of utility classes and interfaces in the Spring API, you can use Resource and its implementations in isolation, without using anything else from Spring.

Upvotes: 0

dmeister

Reputation: 35624

Multiple Possibilities:

  • byte[]
    • the most direct way
  • ByteBuffer
    • flexible
    • has random access and bulk operations
    • has operations for duplicating, slicing, etc.
    • preferable if IO/Network intensive (NIO)
  • InputStream
    • allows pipelining if done right
    • has no support for random access or bulk operations
    • not as flexible as the ByteBuffer
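
To illustrate the ByteBuffer points, here is a small sketch using only java.nio.ByteBuffer from the JDK. The helper class and its method names are hypothetical:

```java
import java.nio.ByteBuffer;

// Hypothetical helper class, for illustration only.
class ByteBufferSketch {

    // Random access: an absolute get does not move the buffer's position.
    static byte byteAt(ByteBuffer buf, int index) {
        return buf.get(index);
    }

    // Slicing: a view of bytes [offset, offset + length) that shares the
    // backing storage with the original buffer instead of copying it.
    static ByteBuffer window(ByteBuffer buf, int offset, int length) {
        // duplicate() shares content but has its own position and limit,
        // so the caller's buffer state is left untouched.
        ByteBuffer copy = buf.duplicate();
        copy.position(offset);
        copy.limit(offset + length);
        return copy.slice();
    }

    // Bulk operation: copies all remaining bytes into a new array in one call.
    static byte[] toArray(ByteBuffer buf) {
        byte[] out = new byte[buf.remaining()];
        buf.duplicate().get(out);
        return out;
    }
}
```

For example, window(ByteBuffer.wrap(bytes), 1, 3) gives a three-byte view starting at index 1 without copying the underlying array.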

I would not use Blob, because putting DB-related stuff into your main model seems strange.

Upvotes: 7
