Reputation: 131162
What are the deficiencies of the built-in BinaryFormatter based .Net serialization? (Performance, flexibility, restrictions)
Please accompany your answer with some code if possible.
Example:
Custom objects being serialized must be decorated with the [Serializable] attribute or implement the ISerializable interface.
Less obvious example:
Anonymous types cannot be serialized.
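A minimal sketch of the first example (the NotMarked type is illustrative): serializing a type that is neither decorated with [Serializable] nor implements ISerializable fails at run time.

using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Not marked [Serializable], and does not implement ISerializable.
class NotMarked
{
    public int Value;
}

class Program
{
    static void Main()
    {
        try
        {
            new BinaryFormatter().Serialize(new MemoryStream(), new NotMarked { Value = 1 });
        }
        catch (SerializationException ex)
        {
            // "Type 'NotMarked' ... is not marked as serializable."
            Console.WriteLine(ex.Message);
        }
    }
}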
Upvotes: 18
Views: 5607
Reputation: 1064104
If you mean BinaryFormatter: it is based on a type's private implementation details (its fields), which makes it very intolerant of version changes, and it has traps such as serializing the subscribers of events unless the delegate backing fields are marked as non-serialized.
I've spent lots of time in this area, including writing a (free) implementation of Google's "protocol buffers" serialization API for .NET: protobuf-net. This produces smaller, faster output, but crucially it also works with the existing framework mechanisms: ISerializable (for remoting etc.) and WCF.
Upvotes: 24
Reputation: 759
Here is another situation that causes the BinaryFormatter to throw an exception.
[Serializable]
class SerializeMe
{
public List<Data> _dataList;
public string _name;
}
[Serializable]
class Data
{
public int _t;
}
Imagine SerializeMe gets serialized today. Tomorrow we decide we no longer need class Data and remove it. Accordingly, we modify the SerializeMe class to remove the List. It is now impossible to deserialize the old version of a SerializeMe object.
The solution is either to create a custom BinaryFormatter that properly ignores the extra classes, or to keep class Data around with an empty definition (there is no need to keep the List member), as sketched below.
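A sketch of the second workaround, following the description above (this "version 2" of the classes is hypothetical):

[Serializable]
class SerializeMe
{
    // _dataList has been removed; old streams can still be deserialized
    // because the formatter can resolve the Data type name they mention.
    public string _name;
}

[Serializable]
class Data
{
    // Empty definition kept around purely so old data can be loaded.
}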
Upvotes: 0
Reputation: 11
I concur with the last answer. The performance is pretty poor.
Recently, my team of coders finished converting a simulation from standard C++ to C++/CLI. Under C++ we had a hand-written persistence mechanism that worked reasonably well. We decided to use the built-in serialization mechanism rather than rewrite the old persistence mechanism.
The old simulation, with a memory footprint between 0.5 and 1 GB, most objects holding pointers to other objects, and thousands of objects at runtime, would persist to a binary file of about 10 to 15 MB in under a minute. Restoring from the file was comparable.
Using the same data files (running side by side), the running performance of the C++/CLI version is about twice that of the C++ version, until we do the persistence (serialization in the new version): writing out takes between 3 and 5 minutes, and reading in takes between 10 and 20. The serialized files are about 5 times the size of the old ones.
Basically, we see a 19-fold increase in read time and a 5-fold increase in write time. This is unacceptable, and we are looking for ways to correct it.
In examining the binary files I discovered a few things:
1. The type and assembly data is written in clear text for all types. This is space-inefficient.
2. Every object/instance of every type has this bloated type/assembly information written out.
One thing we did in our hand-written persistence mechanism was maintain a table of known types. As we discovered types during writing, we looked each one up in this table; if it did not exist, an entry was created with the type info and assigned an index, and from then on we passed the type info as that integer (type, data, type, data). This trick cut the size down tremendously. It may require going through the data twice, but an on-the-fly variant could be developed, whereby the type info is pushed to the stream at the same moment it is added to the table, as long as the order of restoration from the stream can be guaranteed (see the sketch below).
I was hoping to re-implement some of the core serialization to optimize it this way, but, alas, the classes are sealed! We may yet find a way to jerry-rig it.
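A minimal sketch of the on-the-fly variant of that type table (the class and method names here are illustrative, not from any library):

using System;
using System.Collections.Generic;
using System.IO;

class TypeTableWriter
{
    private readonly Dictionary<Type, int> _typeIds = new Dictionary<Type, int>();
    private readonly BinaryWriter _writer;

    public TypeTableWriter(BinaryWriter writer)
    {
        _writer = writer;
    }

    public void WriteTypeRef(Type type)
    {
        int id;
        if (_typeIds.TryGetValue(type, out id))
        {
            // Known type: emit only its small integer id.
            _writer.Write(id);
        }
        else
        {
            // New type: it always receives id == current table size, so a
            // reader replaying the stream in order knows that an id equal
            // to its own table size is new and is followed by the name.
            id = _typeIds.Count;
            _typeIds[type] = id;
            _writer.Write(id);
            _writer.Write(type.AssemblyQualifiedName);
        }
    }
}

The matching reader keeps a List<Type> and, whenever it reads an id equal to the list's current count, reads the name that follows and appends the resolved type. This is why the guaranteed order of restoration is essential.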
Upvotes: 1
Reputation: 131162
A slightly less obvious one is that performance is pretty poor for Object serialization.
Time to serialize and deserialize 100,000 objects on my machine:
Full Serialization Cycle: BinaryFormatter Int[100000] (Time Elapsed 3 ms)
Full Serialization Cycle: BinaryFormatter NumberObject[100000] (Time Elapsed 1246 ms)
Full Serialization Cycle: Manual NumberObject[100000] (Time Elapsed 54 ms)
In this simple example, serializing an object with a single int field is about 20x slower than doing it by hand. Granted, there is some type information in the serialized stream, but that hardly accounts for the 20x slowdown.
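The original benchmark code isn't shown here, but a rough sketch of the kind of comparison involved (NumberObject and the 100,000 count are taken from the output above; everything else is illustrative) could look like this:

using System;
using System.Diagnostics;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
class NumberObject
{
    public int Value;
}

class Program
{
    static void Main()
    {
        var items = new NumberObject[100000];
        for (int i = 0; i < items.Length; i++)
            items[i] = new NumberObject { Value = i };

        // Round-trip through BinaryFormatter.
        var sw = Stopwatch.StartNew();
        using (var ms = new MemoryStream())
        {
            new BinaryFormatter().Serialize(ms, items);
            ms.Position = 0;
            var back = (NumberObject[])new BinaryFormatter().Deserialize(ms);
            Console.WriteLine("BinaryFormatter: {0} items, {1} ms", back.Length, sw.ElapsedMilliseconds);
        }

        // Round-trip by hand with BinaryWriter/BinaryReader.
        sw = Stopwatch.StartNew();
        using (var ms = new MemoryStream())
        {
            var writer = new BinaryWriter(ms);
            writer.Write(items.Length);
            foreach (var item in items)
                writer.Write(item.Value);
            writer.Flush();

            ms.Position = 0;
            var reader = new BinaryReader(ms);
            var back = new NumberObject[reader.ReadInt32()];
            for (int i = 0; i < back.Length; i++)
                back[i] = new NumberObject { Value = reader.ReadInt32() };
            Console.WriteLine("Manual: {0} items, {1} ms", back.Length, sw.ElapsedMilliseconds);
        }
    }
}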
Upvotes: 0
Reputation: 19800
Types being serialized must be decorated with the [Serializable] attribute.
If you mean the members within a class, you are wrong: fields are serialized automatically once the type itself is marked (BinaryFormatter works on fields, public and private alike, not on properties).
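A quick illustration (my own sketch, not from the thread): the attribute goes on the type, and both the public and the private field round-trip without any further decoration.

using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
class Point
{
    public int X;     // serialized automatically
    private int _y;   // private fields are serialized too

    public Point(int x, int y) { X = x; _y = y; }
    public int Y { get { return _y; } }
}

class Program
{
    static void Main()
    {
        var ms = new MemoryStream();
        new BinaryFormatter().Serialize(ms, new Point(1, 2));
        ms.Position = 0;
        var copy = (Point)new BinaryFormatter().Deserialize(ms);
        Console.WriteLine("{0}, {1}", copy.X, copy.Y); // 1, 2
    }
}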
Upvotes: 0
Reputation: 416131
Another issue that came to mind:
The XmlSerializer classes live in a completely different place from the generic run-time formatters, and while they are very similar to use, XmlSerializer does not implement the IFormatter interface. You can't write code that simply swaps the serialization formatter in or out at run time between BinaryFormatter, XmlSerializer, or a custom formatter without jumping through some extra hoops.
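One such hoop, as a sketch (the adapter class is hypothetical): wrapping XmlSerializer behind IFormatter yourself. Doing so also exposes the mismatch that XmlSerializer needs the target type up front, while IFormatter never asks for one.

using System;
using System.IO;
using System.Runtime.Serialization;
using System.Xml.Serialization;

// Wraps XmlSerializer so it can be swapped in wherever an IFormatter
// is expected.
public class XmlFormatterAdapter : IFormatter
{
    private readonly XmlSerializer _serializer;

    public XmlFormatterAdapter(Type type)
    {
        // XmlSerializer must know the type at construction time;
        // IFormatter implementations like BinaryFormatter never need this.
        _serializer = new XmlSerializer(type);
    }

    public SerializationBinder Binder { get; set; }
    public StreamingContext Context { get; set; }
    public ISurrogateSelector SurrogateSelector { get; set; }

    public void Serialize(Stream stream, object graph)
    {
        _serializer.Serialize(stream, graph);
    }

    public object Deserialize(Stream stream)
    {
        return _serializer.Deserialize(stream);
    }
}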
Upvotes: 1
Reputation: 18013
It isn't guaranteed that you can serialize objects back and forth between different Framework versions (say 1.0, 1.1, 3.5) or even different CLR implementations (Mono); again, XML is better for this purpose.
Upvotes: 1
Reputation: 2134
Versioning of data is handled through attributes. If you aren't worried about versioning then this is no problem. If you are, it is a huge problem.
The trouble with the attribute scheme is that it works pretty slickly for many trivial cases (such as adding a new property) but breaks down pretty rapidly when you try to do something like replace two enum values with a different, new enum value (or any number of common scenarios that come with long-lived persistent data).
I could go into lots of details describing the troubles. In the end, writing your own serializer is pretty darn easy if you need to...
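For contrast, here is a sketch of the trivial case the attributes do handle well (the Customer type is illustrative): a field added in version 2 is marked optional, so version-1 streams still deserialize, leaving the new field at its default. There is no comparable attribute for collapsing or renaming enum values.

using System;
using System.Runtime.Serialization;

[Serializable]
class Customer
{
    public string Name;

    // Added in version 2: old streams lacking this field still load.
    [OptionalField(VersionAdded = 2)]
    public string Nickname;
}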
Upvotes: 2
Reputation: 12502
If you change the type you're serializing, all the old data you've serialized and stored is broken. If you had stored it in a database, or even as XML, the old data would be easier to convert to the new format.
Upvotes: 1
Reputation: 416131
Given any random object, it's very difficult to prove whether it really is serializable.
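A sketch of why (the Wrapper type is mine): Type.IsSerializable only inspects the attribute on the static type, so the only reliable proof is to actually run the formatter over the concrete object graph.

using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
class Wrapper
{
    // The static type checks out, but this field can hold anything.
    public object Payload;
}

class Program
{
    static void Main()
    {
        Console.WriteLine(typeof(Wrapper).IsSerializable); // True

        var w = new Wrapper { Payload = new { X = 1 } }; // anonymous type
        try
        {
            new BinaryFormatter().Serialize(new MemoryStream(), w);
        }
        catch (SerializationException ex)
        {
            // Only running the formatter reveals the failure.
            Console.WriteLine(ex.Message);
        }
    }
}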
Upvotes: 3