Telavian

Reputation: 3832

BinaryFormatter very slow with double.NaN

I am using the standard BinaryFormatter to serialize a very large object graph with some third-party objects in the mix. I tried others like Protobuf/JSON/XML, and for one reason or another they all failed. The data is essentially the result of a complicated AI algorithm: a large number of doubles in a heavily nested tree, many of which could be NaN.

It seems that when a double is NaN, the BinaryFormatter fails internally, silently catches the failure, and carries on, which makes serialization very slow. It would be nice if it handled NaN correctly.

The core issue is described in this link.

Is there a workaround so I can deal with NaN directly? I could serialize everything by hand, but that would be a lot of work.

Edit:

In one of the heaviest offenders, a Naive Bayes implementation, the relevant properties are:

public double[][][] Distributions { get; private set; }
public double[] Priors { get; private set; }

Upvotes: 0

Views: 213

Answers (1)

Steve Cooper

Reputation: 21470

All that springs to mind is this.

When you deserialize, the BinaryFormatter reads from a stream. A stream is just a processor for reading bytes, and you can write a stream that wraps and rewrites another. Conceptually:

public class NanToInfStreamReader : Stream
{
    private readonly Stream source;

    public NanToInfStreamReader(Stream source)
    {
        this.source = source;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int read = source.Read(buffer, offset, count);
        ProtectAgainstNaN(buffer, offset, read);
        return read;
    }

    // ... plus the remaining abstract Stream members, delegating to source
}

So the first part is to write a decorating stream like this, and search for any occurrence of the 64 bits that represent Double.NaN. Substitute them in your stream for Double.PositiveInfinity, say.
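As a rough sketch of that substitution (ReplaceNaN is a hypothetical helper, and scanning raw bytes can false-positive on non-double data that happens to contain the same 8 bytes), the bit pattern .NET uses for double.NaN can be obtained with BitConverter and swapped in place:

    using System;

    class NaNPatternDemo
    {
        // The 8 bytes .NET uses for double.NaN and double.PositiveInfinity
        // (byte order depends on the machine's endianness).
        static readonly byte[] NanBytes = BitConverter.GetBytes(double.NaN);
        static readonly byte[] InfBytes = BitConverter.GetBytes(double.PositiveInfinity);

        // Scan a buffer and overwrite every NaN bit pattern with +Inf.
        static void ReplaceNaN(byte[] buffer)
        {
            for (int i = 0; i + 8 <= buffer.Length; i++)
            {
                bool match = true;
                for (int j = 0; j < 8; j++)
                {
                    if (buffer[i + j] != NanBytes[j]) { match = false; break; }
                }
                if (match)
                {
                    Array.Copy(InfBytes, 0, buffer, i, 8);
                    i += 7; // skip past the rewritten value
                }
            }
        }

        static void Main()
        {
            byte[] buffer = BitConverter.GetBytes(double.NaN);
            ReplaceNaN(buffer);
            Console.WriteLine(BitConverter.ToDouble(buffer, 0));
        }
    }

A real implementation would also have to handle a NaN pattern straddling two Read calls, which is part of why this approach is fiddly.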

The BinaryFormatter will never see Double.NaN, and the speed issue won't occur.

However, now your data is filled with +Inf anywhere you had NaN, so you have to go back through your arrays and rewrite them.
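That restoration pass might look something like this (RestoreNaN is an illustrative name, and note the caveat: any legitimate +Inf values in the data would be wrongly turned into NaN too):

    using System;

    class RestorePass
    {
        // Turn the +Inf sentinel back into NaN in a flat array.
        static void RestoreNaN(double[] values)
        {
            for (int i = 0; i < values.Length; i++)
            {
                if (double.IsPositiveInfinity(values[i]))
                    values[i] = double.NaN;
            }
        }

        // Walk a jagged array like the question's Distributions property.
        static void RestoreNaN(double[][][] distributions)
        {
            foreach (var plane in distributions)
                foreach (var row in plane)
                    RestoreNaN(row);
        }
    }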

It's not a great approach. But it sounds like you might be a bit stuck, and it's about all I can suggest.

Upvotes: 1
