What data structure to pick for genericity but safety?

Question

Say I have a long data structure definition:

data A = A {
  x1 :: String
, x2 :: String
...
, x50 :: String
}

Now I have three tasks:

1. create a draft instance of A like A { x1 = "this is x1", ... }
2. create an instance of A from some other data structure
3. create another data instance from an instance of A

All three tasks involve tediously copying the labels x1, ..., x50. A better solution would be a generic list

[
  Foo "x1" aValue1
, Foo "x2" aValue2
...
]

because it would make traversal and creating a draft much easier (the list definition is the draft already). The downside is that mapping other data structures to and from this would be more dangerous, since you lose static type checking.

Does this make sense? Is there a generic but safe solution?

Edit: To give you a better idea, it's about mapping business data to textual representations such as forms and letters. E.g.:

data TaxData = TaxData {
  taxId :: String
, income :: Money
, taxPayed :: Money
, isMarried :: Bool
...
}


data TaxFormA = TaxFormA {
  taxId :: Text
, isMarried :: Text
  ...
}
data TaxFormB = TaxFormB {
  taxId :: Text
, taxPayedRounded :: Text
...
}

Those get transformed into a stream of text representing the actual forms. If I created a form from the tax data in one pass and any form field moved the next year, there would be, for example, a stray "0.0" and I would not know where it belongs. That's what the intermediate data structure is for: it makes it easy to create draft data.

So I need to map the actual TaxData to the intermediate form data, map that form data to the form's actual textual representation, and create draft intermediate form data. On the one hand I hate repeating those data labels; on the other hand, repeating them gives me safety, because I can't confuse any label while mapping. Is there a silver bullet?

Don Stewart · Accepted Answer

Deeply structured data like this is most idiomatically expressed in Haskell as nested, algebraic data types, as you have done. Why? It gives the most type structure and safety to the data, preventing functions from putting the data into the wrong format. Further safety can be gained by newtyping some of the types, to increase the differences between data in each field.
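
As a minimal sketch of the newtype suggestion: the wrapper types Income and TaxPaid and the function renderTaxPaid below are hypothetical (the question only shows a Money type without a definition), and storing amounts as Integer cents is an assumption made for illustration.

```haskell
-- Hypothetical wrappers: two amounts that would otherwise share one type.
-- Amounts are stored as cents in an Integer (an assumption for this sketch).
newtype Income  = Income  Integer deriving (Show, Eq)
newtype TaxPaid = TaxPaid Integer deriving (Show, Eq)

data TaxData = TaxData
  { income   :: Income
  , taxPayed :: TaxPaid
  } deriving (Show, Eq)

-- Renders the tax paid in whole currency units (integer division by 100).
-- Only a TaxPaid is accepted, so an Income cannot end up in this form field.
renderTaxPaid :: TaxPaid -> String
renderTaxPaid (TaxPaid cents) = show (cents `div` 100)
```

With these types, renderTaxPaid (income someData) is rejected by the type checker, whereas with two plain Money fields the mix-up would compile.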

However, very large ADTs like this can be unwieldy to name and manipulate. Specifying such a large ADT is a common situation in compiler design, for example, and to help write the code for a compiler we tend to use a lot of generic programming tricks: SYB, meta-programming, even Template Haskell, to generate all the boilerplate we need.
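
To illustrate the kind of boilerplate generics can remove, here is a minimal sketch using Data.Data from base, the class that SYB-style generic programming builds on. It reads the field labels off the record instead of repeating them by hand. The helper names fieldLabels and toPairs are hypothetical, and the form uses String rather than Text to keep the sketch free of extra packages.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data (Data, constrFields, gmapQ, toConstr)
import Data.Maybe (fromMaybe)
import Data.Typeable (cast)

-- A cut-down form record, loosely modelled on TaxFormB from the question.
data TaxFormB = TaxFormB
  { taxId           :: String
  , taxPayedRounded :: String
  } deriving (Show, Data)

-- Field labels of a record value, taken from its constructor.
fieldLabels :: Data a => a -> [String]
fieldLabels = constrFields . toConstr

-- Label/value pairs in declaration order.
-- Fields that are not Strings get a placeholder instead of a value.
toPairs :: Data a => a -> [(String, String)]
toPairs x = zip (fieldLabels x) (gmapQ render x)
  where
    render :: Data d => d -> String
    render d = fromMaybe "<not a String>" (cast d)
```

The record stays the statically checked source of truth, while the generic list from the question becomes a derived view of it rather than a replacement for it.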

So, in summary, I'd keep the ADT approach you are taking, but look at using generics (e.g. SYB or Template Haskell) to generate some of your definitions and helper functions.
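
One possible way to get the draft instance from the question without typing the labels again is sketched below. Note that it uses GHC.Generics, a different generics mechanism from the SYB and Template Haskell options named above, and that the class GDraft and the function draft are hypothetical names. It assumes every field is a String.

```haskell
{-# LANGUAGE DeriveGeneric, FlexibleContexts, FlexibleInstances, TypeOperators #-}

import GHC.Generics

-- A cut-down form record, loosely modelled on TaxFormA from the question.
data TaxFormA = TaxFormA
  { taxId     :: String
  , isMarried :: String
  } deriving (Show, Eq, Generic)

-- Builds a generic representation in which each field holds its own label.
class GDraft f where
  gdraft :: f p

-- Datatype and constructor metadata: pass through.
instance GDraft f => GDraft (M1 D c f) where
  gdraft = M1 gdraft

instance GDraft f => GDraft (M1 C c f) where
  gdraft = M1 gdraft

-- Several fields: build each side of the product.
instance (GDraft f, GDraft g) => GDraft (f :*: g) where
  gdraft = gdraft :*: gdraft

-- A single String field: fill it with "this is <label>".
-- selName only inspects the type, so the self-reference is safe.
instance Selector s => GDraft (M1 S s (K1 i String)) where
  gdraft = field
    where field = M1 (K1 ("this is " ++ selName field))

-- Draft value for any record whose fields are all Strings.
draft :: (Generic a, GDraft (Rep a)) => a
draft = to gdraft
```

Here draft :: TaxFormA evaluates to TaxFormA {taxId = "this is taxId", isMarried = "this is isMarried"}. A record with a non-String field has no matching instance, so asking for its draft fails at compile time rather than silently producing a wrong value.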
