David L

Reputation: 44668

Simple properties to string conversion in Java

Using Java, I need to encode a Map<String, String> of name value pairs to store into a String, and be able to decode it again. These will be stored in a database column, and will probably usually be short and simple, so the common case should produce a simple nice looking line, but shouldn't corrupt the data, even if it contains unexpected characters, etc.

How would you choose to do it such that:

Url encoding? JSON? Do it yourself? Please specify any helper libraries or methods you'd use.

(Edited to specify more context and requirements as requested.)

Upvotes: 3

Views: 7677

Answers (7)

corlettk

Reputation: 13574

I realise this is an old "deadish" thread, but I've got a solution not posited previously which I think is worth throwing in the ring.

We store "arbitrary" attributes (i.e. created by the user at runtime) of geographic features in a single CLOB column in the DB in the standard XML attributes format. That is:

name="value" name="value" name="value"

To create an XML element you just "wrap up" the attributes in an xml element. That is:

String xmlString = "<arbitraryAttributes " + arbitraryAttributesString + " />";

"Serialising" a Properties instance to an xml-attributes-string is a no-brainer... it's like ten lines of code. We're lucky in that we can impose on the users the rule that all attribute names must be valid xml-element-names; and we xml-escape (i.e. &quot; etc.) each "value" to avoid problems from double-quotes and whatever in the value strings.

It's effective, flexible, fast (enough) and simple.
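A minimal sketch of the attribute-string idea described above, assuming the JDK's built-in DOM parser does the decoding. The class name `XmlAttributeCodec` and its methods are hypothetical, not the code from the system described. Names are assumed to be valid XML names already, and values must not contain characters that XML 1.0 cannot represent (such as NUL).

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: stores a map as name="value" name="value" ...
final class XmlAttributeCodec {
    private XmlAttributeCodec() {}

    static String encode(Map<String, String> attrs) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            if (sb.length() > 0) sb.append(' ');
            // Assumption: the key is already a valid XML name, so only the value is escaped.
            sb.append(e.getKey()).append("=\"").append(escape(e.getValue())).append('"');
        }
        return sb.toString();
    }

    // Escapes markup characters, plus whitespace that attribute-value
    // normalisation would otherwise turn into plain spaces on parsing.
    static String escape(String value) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\n': sb.append("&#10;");  break;
                case '\r': sb.append("&#13;");  break;
                case '\t': sb.append("&#9;");   break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    static Map<String, String> decode(String attrString) {
        // Wrap the attributes in a throwaway element and let the parser unescape them.
        String xml = "<arbitraryAttributes " + attrString + " />";
        try {
            NamedNodeMap attrs = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement().getAttributes();
            Map<String, String> map = new LinkedHashMap<>();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                map.put(attr.getNodeName(), attr.getNodeValue());
            }
            return map;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("not a valid attribute string", e);
        }
    }
}
```

Note that the parser does not promise to hand the attributes back in their original order, so treat the decoded map as unordered.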

Now, having said all that... if we had the time again, we'd just totally divorce ourselves from the whole "metadata problem" by storing the complete unadulterated uninterpreted metadata xml-document in a CLOB and use one of the open-source metadata editors to handle the whole mess.

Cheers. Keith.

Upvotes: 0

ShawnD

Reputation: 992

Check out the Apache Commons Configuration package. It lets you read and save a file in XML or properties format. It also gives you the option of automatically saving property changes to a file.

Apache Configuration

Upvotes: 0

cletus

Reputation: 625307

Why not just use the Properties class? That does exactly what you want.
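A rough sketch of how `java.util.Properties` could be pointed at a String instead of a file, using `StringWriter` and `StringReader`. The wrapper class `PropertiesCodec` is a hypothetical name for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper around java.util.Properties.
final class PropertiesCodec {
    private PropertiesCodec() {}

    static String encode(Map<String, String> map) {
        Properties props = new Properties();
        props.putAll(map); // Properties rejects null keys and null values
        StringWriter out = new StringWriter();
        try {
            props.store(out, null); // null means no custom comment line
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for an in-memory writer
        }
        return out.toString();
    }

    static Map<String, String> decode(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for an in-memory reader
        }
        // Properties does not keep insertion order, so sort the keys for a stable result.
        Map<String, String> map = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            map.put(name, props.getProperty(name));
        }
        return map;
    }
}
```

Two things to be aware of: `store` writes one entry per line and normally adds a date comment line at the top, so the result is not a single tidy line, and the original entry order is lost. Separators, leading spaces and line breaks inside keys or values are escaped by `store` and restored by `load`.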

Upvotes: 3

alepuzio

Reputation: 1418

As @DanVinton says, if you need this for internal use (by "internal use" I mean "it's used only by my components, not components written by others"), you can concatenate the keys and values. I prefer to use one separator between a key and its value, and a different one between one pair and the next:
Instead of
key1+SEPARATOR+value1+SEPARATOR+key2 etc
I code
key1+SEPARATOR_KEY_AND_VALUE+value1+SEPARATOR_KEY(n)_AND_KEY(N+1)+key2 etc

If you have to debug, this way is clearer (by design, too).
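A possible sketch of the two-separator scheme. The separator characters (`=` and `;`) and the backslash escape are my own assumptions, as is the class name `DelimitedCodec`. The escaping is what keeps a key or value containing a separator from corrupting the data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: key=value;key=value with backslash escaping.
final class DelimitedCodec {
    private static final char KEY_VALUE_SEP = '='; // assumed separator between key and value
    private static final char PAIR_SEP = ';';      // assumed separator between pairs
    private static final char ESCAPE = '\\';

    private DelimitedCodec() {}

    static String encode(Map<String, String> map) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : map.entrySet()) {
            if (sb.length() > 0) sb.append(PAIR_SEP);
            appendEscaped(sb, e.getKey());
            sb.append(KEY_VALUE_SEP);
            appendEscaped(sb, e.getValue());
        }
        return sb.toString();
    }

    // Prefixes the escape character and both separators with a backslash.
    private static void appendEscaped(StringBuilder sb, String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == ESCAPE || c == KEY_VALUE_SEP || c == PAIR_SEP) sb.append(ESCAPE);
            sb.append(c);
        }
    }

    static Map<String, String> decode(String text) {
        Map<String, String> map = new LinkedHashMap<>();
        if (text.isEmpty()) return map; // the empty string encodes the empty map
        StringBuilder current = new StringBuilder();
        String key = null; // null while we are still reading a key
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == ESCAPE && i + 1 < text.length()) {
                current.append(text.charAt(++i)); // take the next character literally
            } else if (c == KEY_VALUE_SEP && key == null) {
                key = current.toString();
                current.setLength(0);
            } else if (c == PAIR_SEP) {
                if (key == null) throw new IllegalArgumentException("pair without '=' before index " + i);
                map.put(key, current.toString());
                key = null;
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        if (key == null) throw new IllegalArgumentException("last pair has no '='");
        map.put(key, current.toString());
        return map;
    }
}
```

With these assumed separators the common case stays readable, e.g. `a=1;b=2`, and only unusual values pick up backslashes.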

Upvotes: 0

Dan Vinton

Reputation: 26769

As @Uri says, additional context would be good. I think your primary concerns are less about the particular encoding scheme, as rolling your own for most encodings is pretty easy for a simple Map<String, String>.

An interesting question is: what will this intermediate string encoding be used for?

  • if it's purely internal, an ad-hoc format is fine, e.g. simple concatenation:

    key1|value1|key2|value2
    
  • if humans might read it, a format like Ruby's map declaration is nice:

    { first_key  => first_value, 
      second_key => second_value }
    
  • if the encoding is to send a serialised map over the wire to another application, the XML suggestion makes a lot of sense as it's standard-ish and reasonably self-documenting, at the cost of XML's verbosity.

    <map>
        <entry key='foo' value='bar'/>
        <entry key='this' value='that'/>
    </map>
    
  • if the map is going to be flushed to file and read back later by another Java application, @Cletus' suggestion of the Properties class is a good one, and has the additional benefit of being easy to open and inspect by human beings.


Edit: you've added the information that this is to store in a database column - is there a reason to use a single column, rather than three columns like so:

CREATE TABLE StringMaps 
(
    map_id NUMBER         NOT NULL,  -- ditch this if you only store one map...
    key    VARCHAR2(255)  NOT NULL,  -- Oracle needs a length; 255 is only illustrative
    value  VARCHAR2(4000)            -- likewise an illustrative length
);

As well as letting you store more semantically meaningful data, this moves the encoding/decoding into your data access layer more formally, and allows other database readers to easily see the data without having to understand any custom encoding scheme you might use. You can also easily query by key or value if you want to.


Edit again: you've said that it really does need to fit into a single column, in which case I'd either:

  • use the first pipe-separated encoding (or whatever exotic character you like, maybe some unprintable-in-English unicode character). Simplest thing that works. Or...

  • if you're using a database like Oracle that recognises XML as a real type (and so can give you XPath evaluations against it and so on) and need to be able to read the data well from the database layer, go with XML. Writing XML parsers for decoding is never fun, but shouldn't be too painful with such a simple schema.

Even if your database doesn't support XML natively, you can just throw it into any old character-like column-type...
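For the XML option, a sketch along these lines may be enough, assuming the `<map>`/`<entry>` schema shown earlier and the DOM and transformer classes that ship with the JDK, so no parser has to be written by hand. `XmlMapCodec` is a hypothetical name.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper for the <map><entry key='..' value='..'/></map> schema.
final class XmlMapCodec {
    private XmlMapCodec() {}

    static String encode(Map<String, String> map) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("map");
            doc.appendChild(root);
            for (Map.Entry<String, String> e : map.entrySet()) {
                Element entry = doc.createElement("entry");
                entry.setAttribute("key", e.getKey());     // escaping is left to the serialiser
                entry.setAttribute("value", e.getValue());
                root.appendChild(entry);
            }
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes"); // keep the stored string short
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("could not serialise map", e);
        }
    }

    static Map<String, String> decode(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList entries = doc.getDocumentElement().getElementsByTagName("entry");
            Map<String, String> map = new LinkedHashMap<>();
            for (int i = 0; i < entries.getLength(); i++) {
                Element entry = (Element) entries.item(i);
                map.put(entry.getAttribute("key"), entry.getAttribute("value"));
            }
            return map;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("not a valid map document", e);
        }
    }
}
```

Characters that XML 1.0 cannot represent at all (NUL, for example) would still need separate handling, so this is not a fully general answer to the "unexpected characters" requirement.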

Upvotes: 5

Rob Williams

Reputation: 7921

I have been contemplating a similar need of choosing a common representation for the conversations (transport content) between my clients and servers via a facade pattern. I want a representation that is standardized, human-readable (brief), robust, fast. I want it to be lightweight to implement and run, easy to test, and easy to "wrap". Note that I have already eliminated XML by my definition, and by explicit intent.

By "wrap", I mean that I want to support other transport content representations such as XML, SOAP, possibly Java properties or Windows INI formats, comma-separated values (CSV) and that ilk, Google protocol buffers, custom binary formats, proprietary binary formats like Microsoft Excel workbooks, and whatever else may come along. I would implement these secondary representations using wrappers/decorators around the primary facade. Each of these secondary representations is desirable, especially to integrate with other systems in certain circumstances, but none of them is desirable as a primary representation due to various shortcomings (failure to meet one or more of my criteria listed above).

Therefore, so far, I am opting for the JSON format as my primary transport content representation. I intend to explore that option in detail in the near future.

Only in cases of extreme performance considerations would I skip translating the underlying conventional format. The advantages of a clean design include good performance (no wasted effort, ease of maintainability) for which a decent hardware selection should be the only necessary complement. When performance needs become extreme (e.g., processing forty thousand incoming data files totaling forty million transactions per day), then EVERYTHING has to be revisited anyway.

As a developer, DBA, architect, and more, I have built systems of practically every size and description. I am confident in my selection of criteria, and eagerly await confirmation of its suitability. Indeed, I hope to publish an implementation as open-source (but don't hold your breath quite yet).

Note that this design discussion ignores the transport medium (HTTP, SMTP, RMI, .Net Remoting, etc.), which is intentional. I find that it is much more effective to treat the transport medium and the transport content as completely separate design considerations, from each other and from the system in question. Indeed, my intent is to make these practically "pluggable".

Therefore, I encourage you to strongly consider JSON. Best wishes.

Upvotes: 1

Uri

Reputation: 89819

Some additional context for the question would help.

If you're going to be encoding and decoding at the entire-map granularity, why not just use XML?

Upvotes: 0
