Can Haskell functions be serialized?

The best way to do it would be to get the representation of the function (if it can be recovered somehow). Binary serialization is preferred for efficiency reasons.

I think there is a way to do it in Clean, because it would be impossible to implement iTask, which relies on that tasks (and so functions) can be saved and continued when the server is running again.

This must be important for distributed haskell computations.

I'm not looking for parsing haskell code at runtime as described here: Serialization of functions in Haskell. I also need to serialize not just deserialize.

Upvotes: 28

Views: 3572

Answers (5)

anon
anon

Reputation:

A pretty simple and practical, but maybe not as elegant solution would be to (preferably have GHC automatically) compile each function into a separate module of machine-independent bytecode, serialize that bytecode whenever serialization of that function is required, and use the dynamic-loader or plugins packages, to dynamically load them, so even previously unknown functions can be used.

Since a module notes all its dependencies, those could then be (de)serialized and loaded too. In practice, serializing index numbers and attaching an indexed list of the bytecode blobs would probably be the most efficient.

I think as long as you compile the modules yourself, this is already possible right now.

As I said, it would not be very pretty though. Not to mention the generally huge security risk of de-serializing code from insecure sources to run in an unsecured environment. :-)
(No problem if it is trustworthy, of course.)

I’m not going to code it up right here, right now though. ;-)

Upvotes: 0

axm
axm

Reputation: 281

Eden probably comes closest and probably deserves a seperate answer: (De-)Serialization of unevaluated thunks is possible, see https://github.com/jberthold/packman.

Deserialization is however limited to the same program (where program is a "compilation result"). Since functions are serialized as code pointers, previously unknown functions cannot be deserialized.

Possible usage:

  • storing unevaluated work for later
  • distributing work (but no sharing of new code)

Upvotes: 2

Rob Stewart
Rob Stewart

Reputation: 1842

No. However, the CloudHaskell project is driving home the need for explicit closure serialization support in GHC. The closest thing CloudHaskell has to explicit closures is the distributed-static package. Another attempt is the HdpH closure representation. However, both use Template Haskell in the way Thomas describes below.

The limitation is a lack of static support in GHC, for which there is a currently unactioned GHC ticket. (Any takers?). There has been a discussion on the CloudHaskell mailing list about what static support should actually look like, but nothing has yet progressed as far as I know.

The closest anyone has come to a design and implementation is Jost Berthold, who has implemented function serialisation in Eden. See his IFL 2010 paper "Orthogonal Serialisation for Haskell". The serialisation support is baked in to the Eden runtime system. (Now available as separate library: packman. Not sure whether it can be used with GHC or needs a patched GHC as in the Eden fork...) Something similar would be needed for GHC. This is the serialisation support Eden, in the version forked from GHC 7.4:

data Serialized a = Serialized { packetSize :: Int , packetData :: ByteArray# }
serialize   :: a -> IO (Serialized a)
deserialize :: Serialized a -> IO a

So: one can serialize functions and data structures. There is a Binary instance for Serialized a, allowing you to checkpoint a long-running computation to file! (See Secion 4.1).

Support for such a simple serialization API in the GHC base libraries would surely be the Holy Grail for distributed Haskell programming. It would likely simplify the composability between the distributed Haskell flavours (CloudHaskell, MetaPar, HdpH, Eden and so on...)

Upvotes: 22

augustss
augustss

Reputation: 23014

Unfortunately, it's not possible with the current ghc runtime system. Serialization of functions, and other arbitrary data, requires some low level runtime support that the ghc implementors have been reluctant to add.

Serializing functions requires that you can serialize anything, since arbitrary data (evaluated and unevaluated) can be part of a function (e.g., a partial application).

Upvotes: 27

Ankur
Ankur

Reputation: 33657

Check out Cloud Haskell. It has a concept called Closure which is used to send code to be executed on remote nodes in a type safe manner.

Upvotes: 20

Related Questions