Reputation: 9805
For a statistical model, if I want to use units of measure consistently, I need to encode the number of data points somewhere in the types.
type DataSet<[<Measure>] 'x, [<Measure>] 'y, [<Measure>] 'n> =
    DataSet of Matrixu<'n,'x> * Vectoru<'n,'y>
I can then define a function that merges two DataSets:
static member (+) (a:DataSet<'x,'y,'n1>,b:DataSet<'x,'y,'n2>):DataSet<'x,'y,'n> = ...
But I want to be able to merge an unknown number of datasets of different sizes. How do I define such a function?
The naive approach fails because a sequence has a single element type, so I'd need to fix the size statically.
static member merge (ar:DataSet<'x,'y, ??? > seq) : DataSet<'x,'y, 'n> =
    // if the sequence is empty, etc...
    let head = ar |> Seq.head
    let others = ar |> Seq.skip 1
    others |> Seq.fold (fun st el -> st + el) head
Or should I just add specific rules and not track the size of the data at all? That means adding custom dimension management in application code, which is dirty as well, and kind of ruins the point of having clean dimensions in the first place!
Upvotes: 1
Views: 144
Reputation: 2380
I think you are stretching the usage of units of measure. In my experience they work great for physics and similar domains (finance, I guess), but that's it. I don't believe they can handle the sizes of statically sized containers (e.g. n-by-m matrices).
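To illustrate why sizes are a poor fit, here is a minimal sketch. The measures `m` and the `Sized` wrapper are hypothetical stand-ins invented for this example, not the `DataSet` type from the question. Measure expressions can be combined by product, quotient, and integer power, but there is no way to write a sum of two measures, and a sum is what concatenation would need.

```fsharp
// Hypothetical measure, used only for this illustration.
[<Measure>] type m

// Multiplying measured values multiplies their units, so this type-checks.
let area (w: float<m>) (h: float<m>) : float<m^2> = w * h

// Hypothetical stand-in for DataSet: the measure 'n is meant to carry the size.
type Sized<[<Measure>] 'n> = Sized of float list

// Concatenation should yield a size of "'n1 + 'n2", but measures have no
// addition. The closest expressible result is a product, which is the wrong
// arithmetic for a merged size.
let merge (Sized a : Sized<'n1>) (Sized b : Sized<'n2>) : Sized<'n1 * 'n2> =
    Sized (a @ b)
```

Even with this workaround, a `seq` of such values would still need one common `'n` for every element, which is the problem described in the question.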
Have you heard about dependent type systems and F*? I've seen examples with fixed-size lists, which suggests it might fit the job. It's still a research project at Microsoft, though, and might not be suitable for commercial use.
Upvotes: 1