donatello

Reputation: 6255

How to serialize/deserialize objects sent over the network in Haskell?

I see that there are many ways to serialize/deserialize Haskell objects:

In my application, I want to set up a simple TCP client-server, where the client may send serialized Haskell record objects. How does one decide between these serialization alternatives?

Additionally, when objects serialized into strings are sent over the network using Network.Socket, what comes back on the receiving side is just strings. Is there a slightly higher-level library that works at the level of whole TCP messages? In other words, is there a way to avoid writing parsing code on the receive end that:

In my application, the objects are not expected to be too large (maybe about ~1MB max).

Upvotes: 6

Views: 1179

Answers (2)

danidiaz

Reputation: 27766

As for the second part of your question, two things are required:

  1. An incremental parser that doesn't need to have the whole document in memory to start parsing, and which can be fed with the partial chunks of data arriving from the wire. Also, when the parsing succeeds it must return any "leftover data" along with the parsed value.

  2. A source of data with "pushback capabilities", that allows you to "unread" any leftovers so that they are available to the next parsing attempt.

The most popular library providing (1) is attoparsec. As for (2), all three main streaming libraries (conduit, io-streams, and pipes) offer some kind of pushback functionality (the last one through the auxiliary pipes-parse package). All three libraries can also integrate with attoparsec parsers.
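To make the two requirements concrete, here is a minimal sketch of an incremental parser that accepts chunks as they arrive and hands back the leftovers. It swaps in the incremental decoder from Data.Binary.Get (binary package) rather than attoparsec, only because binary ships with GHC. The wire format (two big-endian Word32 values) and the names pairParser and feedChunks are made up for illustration.

```haskell
import qualified Data.ByteString as BS
import Data.Binary.Get (Decoder (..), Get, getWord32be, runGetIncremental)
import Data.Word (Word32)

-- Hypothetical wire format: one message is two big-endian Word32 values.
pairParser :: Get (Word32, Word32)
pairParser = (,) <$> getWord32be <*> getWord32be

-- Feed chunks to the decoder one at a time, the way they would arrive
-- from a socket. On success, return the parsed value together with every
-- byte that was not consumed, so the caller can "unread" it.
feedChunks :: Decoder a -> [BS.ByteString] -> Either String (a, BS.ByteString)
feedChunks (Done leftover _ value) rest = Right (value, BS.concat (leftover : rest))
feedChunks (Fail _ _ err) _ = Left err
feedChunks (Partial _) [] = Left "input ended before a full message arrived"
feedChunks (Partial k) (c : cs) = feedChunks (k (Just c)) cs

main :: IO ()
main = print (feedChunks (runGetIncremental pairParser) chunks)
  where
    -- The message boundary falls in the middle of the second chunk.
    chunks = [BS.pack [0, 0, 0, 1, 0, 0], BS.pack [0, 2, 9, 9]]
```

Here the decoder asks for more input after the first chunk, finishes partway through the second, and returns the two trailing bytes as leftovers. In a real receive loop those leftovers would be pushed back so the next parse starts with them.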

(Another option, of course, is to prepend each message with its length and read only that exact number of bytes.)
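The length-prefix approach can be sketched with pure functions over a receive buffer. This is only an illustration: the 4-byte big-endian header and the names frame and unframe are assumptions, and the socket I/O is left out, so the code needs nothing beyond the bytestring and binary packages.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as BL
import Data.Binary.Get (getWord32be, runGet)
import Data.Binary.Put (putByteString, putWord32be, runPut)

-- Prepend a 4-byte big-endian length to the payload
-- (the header width is an assumption of this sketch).
frame :: BS.ByteString -> BS.ByteString
frame payload = BL.toStrict . runPut $ do
  putWord32be (fromIntegral (BS.length payload))
  putByteString payload

-- Try to split one complete message off the front of a receive buffer.
-- Nothing means "not enough bytes yet, keep reading"; Just returns the
-- payload and the bytes that belong to later messages.
unframe :: BS.ByteString -> Maybe (BS.ByteString, BS.ByteString)
unframe buf
  | BS.length buf < 4 = Nothing
  | BS.length body < len = Nothing
  | otherwise = Just (BS.splitAt len body)
  where
    (header, body) = BS.splitAt 4 buf
    len = fromIntegral (runGet getWord32be (BL.fromStrict header))

main :: IO ()
main = do
  -- One full message followed by the first 3 bytes of the next one.
  let buf = BS.append (frame (BC.pack "hello")) (BS.take 3 (frame (BC.pack "world")))
  print (unframe buf)
  -- The remainder alone is not a complete message yet.
  print (unframe . snd =<< unframe buf)
```

A receiver would keep appending whatever the socket returns to the buffer and call unframe until it yields Nothing. A real protocol would probably also reject lengths above some maximum before buffering that much data.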

Upvotes: 1

Andrew Thaddeus Martin

Reputation: 3285

To answer the first part of your question (about data serialization), I would say that everything you listed sounds fine. Since you are dealing with pretty big (1MB) serializations, I think that the most important thing is laziness. There is another serialization library, called cereal, that has strict serializations, and you wouldn't want that because you'd need to build the whole thing up in memory before sending it out. I'll give a shout-out to aeson (http://hackage.haskell.org/package/aeson-0.8.0.2/docs/Data-Aeson.html), which you can use with GHC Generics to get something simple like this:

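{-# LANGUAGE DeriveGeneric #-}
import Data.Aeson (FromJSON, ToJSON)
import GHC.Generics (Generic)

data Shape = Rect Int Int | Circle Double | Other String Int
  deriving (Generic)
instance FromJSON Shape  -- uses the Generic-based default implementation
instance ToJSON Shape    -- uses the Generic-based default implementation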

And then, bam!, you've got access to the encode and decode methods. I don't know about a higher level TCP library. Hopefully, someone else will have more insight on that.
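For illustration, a round trip through encode and decode might look like the sketch below. It repeats the Shape type so that it compiles on its own; the Show and Eq instances and the main function are additions for the demo, not part of the original snippet.

```haskell
{-# LANGUAGE DeriveGeneric #-}

import Data.Aeson (FromJSON, ToJSON, decode, encode)
import qualified Data.ByteString.Lazy.Char8 as BL
import GHC.Generics (Generic)

data Shape = Rect Int Int | Circle Double | Other String Int
  deriving (Generic, Show, Eq)

instance FromJSON Shape -- Generic-based default
instance ToJSON Shape   -- Generic-based default

main :: IO ()
main = do
  -- encode produces a lazy ByteString containing the JSON text.
  let bytes = encode (Rect 3 4)
  BL.putStrLn bytes
  -- decode returns Nothing when the bytes do not describe a valid Shape.
  print (decode bytes :: Maybe Shape)
```

The bytes from encode are what you would write to the socket. On the receiving end, decode needs the complete JSON document for one message, so it still has to be paired with some way of finding message boundaries.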

Upvotes: 1
