nicolas

Reputation: 9805

Units of measures and statistical models with F#

For a statistical model, if I want to use units of measure consistently, I need to encode the number of data points somewhere in the types.

type DataSet<[<Measure>] 'x, [<Measure>] 'y, [<Measure>] 'n>  = 
        DataSet of Matrixu<'n,'x> * Vectoru<'n,'y>

I can then define a function that merges two datasets:

static member (+) (a:DataSet<'x,'y,'n1>,b:DataSet<'x,'y,'n2>):DataSet<'x,'y,'n>  = ...

But I want to be able to merge an unknown number of datasets of different sizes. How do I define such a function?

The naive way fails because a sequence holds elements of a single type, so I'd need to fix the size statically.

    static member merge (ar:DataSet<'x,'y, ??? > seq) : DataSet<'x,'y, 'n>  = 
        // handle the empty sequence, etc.
        let   head = ar |> Seq.head
        let others = ar |> Seq.skip 1
        others |> Seq.fold (fun st el -> st + el) head

Or should I just add specific rules and not track the size of the data in the types? That means adding custom dimension management in application code, which is dirty as well, and kind of ruins the point of having clean dimensions in the first place!
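To make that second option concrete, here is a minimal sketch, assuming the row count is dropped from the type and only the 'x and 'y units are kept. The two-parameter DataSet, the plain arrays standing in for Matrixu and Vectoru, and the helper names count and merge are all hypothetical simplifications, not the types from the code above.

```fsharp
[<Measure>] type cm
[<Measure>] type kg

// Hypothetical simplified DataSet: the units of the features ('x) and of the
// targets ('y) stay in the type, the number of rows is only known at runtime.
type DataSet<[<Measure>] 'x, [<Measure>] 'y> =
    | DataSet of float<'x>[][] * float<'y>[]

// Number of observations, read at runtime instead of from a type parameter.
let count (DataSet (xs, _)) = xs.Length

// Merge any number of datasets that share the same units.
// An empty input sequence yields an empty dataset.
let merge (sets: seq<DataSet<'x, 'y>>) : DataSet<'x, 'y> =
    let parts =
        sets
        |> Seq.map (fun (DataSet (xs, ys)) -> xs, ys)
        |> Seq.toArray
    DataSet (Array.collect fst parts, Array.collect snd parts)

let a = DataSet ([| [| 170.0<cm> |]; [| 182.0<cm> |] |], [| 65.0<kg>; 80.0<kg> |])
let b = DataSet ([| [| 160.0<cm> |] |], [| 55.0<kg> |])

let merged = merge [ a; b ]
printfn "%d rows" (count merged)
```

The compiler still rejects merging datasets whose 'x or 'y units differ, but any size mismatch (for example between the feature rows and the target vector) would have to be checked by hand at runtime.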

Upvotes: 1

Views: 144

Answers (1)

Joh

Reputation: 2380

I think you are stretching the usage of units of measure. In my experience they work great for physics and similar domains (finance, I guess), but that's it. I don't believe they can handle the sizes of statically sized containers (e.g. n-by-m matrices).
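As a rough illustration of why sizes are a poor fit, here is a small sketch of what unit arithmetic does express. The units m and s and the value names are made up for the example.

```fsharp
[<Measure>] type m
[<Measure>] type s

let distance = 100.0<m>
let time = 9.58<s>

// Dividing or multiplying values divides or multiplies their units.
let speed : float<m/s> = distance / time
let area : float<m^2> = distance * distance

// Adding values requires both operands to carry the same unit and leaves
// that unit unchanged, so there is no unit expression meaning "n1 + n2 rows".
let total : float<m> = distance + 50.0<m>
```

Unit expressions are built from products, quotients and integer powers, which matches physical dimensions well. The size of a merged dataset is a sum of sizes, and that has no counterpart in this algebra.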

Have you heard about dependent type systems and F*? I've seen examples with fixed-size lists, which suggests it might fit the job. It is still a research project at Microsoft, though, so it might not be suitable for commercial use.

Upvotes: 1
