Dimitri Lesnoff

Reputation: 382

Nim : How to constrain an existing type

I have a question about type definition.

I would like to restrict an existing type so that it enforces additional criteria. For example, I would like to construct a type for a DNA string.

A DNA strand can be seen as an arbitrarily long string that contains only the characters 'A', 'C', 'G', 'T' (nucleotides). Similarly, I would define an RNA string as a string containing only the characters 'A', 'C', 'G', 'U'.

An RNA string can be decomposed into codons, each of which is a string of exactly three characters drawn from the four nucleotides ('A', 'C', 'G', 'U'). Can I make a codon type that automatically checks (e.g. at initialisation or after a type conversion) whether the string has length 3 and contains only valid characters?

I have attempted to use a concept type:

var
  NucleotideSet: set[char] = {'A','C','G','U'}

type
  Nucleotide {.explain.} = concept var a
    a is char
    a in {'A','C','G','U'}

  RnaCodon = seq[Nucleotide]

but, as far as I can tell, this experimental feature does not coerce an existing type; it only checks whether a type satisfies some properties. I might be mistaken.

What I want is to manipulate RNA strings without having to check manually that each character is indeed a nucleotide.

With the definitions in my code above, the following fails:

echo 'A' is Nucleotide

I get a type mismatch: ''A'' is char but expected Nucleotide. What have I done wrong in this example, and how could I fix it to define an RNA string and a codon? My guess now is that, in the concept type, a is not the type but the variable, and I would probably need to write something like:

type
  Nucleotide {.explain.} = concept var a, type T
    a is T
    T is char
    a in {'A','C','G','U'}

but I also get a type mismatch error.

Upvotes: 2

Views: 250

Answers (2)

Grzegorz Adam Hankiewicz

Reputation: 7681

As far as you have explained, the only problem is that you want a kind of variable that is guaranteed to hold only certain values. I'd use normal strings wrapped in a distinct type. The section on avoiding SQL injection attacks in the documentation explains how this could work:

  1. Create a distinct string for both a DNA and RNA strand.
  2. Create a validation/parsing function that converts any input to DNA or RNA distinct strings.
  3. For each input string, pass the input to those conversion functions. If they are valid DNA/RNA, your function returns the string converted to the distinct type, otherwise the input is discarded and/or an error is generated.
  4. From then on, other code uses only the distinct type, so you can't pass unvalidated strings there, and have confidence that the data sent to those procs has been validated.

When you are using the distinct types, you don't even need to check whether a given element of the distinct type is a nucleotide, since your input validation/conversion proc has already dealt with that once.
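The four steps above can be sketched roughly as follows. This is only a minimal illustration, not code from the documentation: the names RnaString, RnaCodon, toRna, toCodon and codons are hypothetical, and raising ValueError on bad input is an assumption (returning an Option would work just as well).

```nim
# All type and proc names here are hypothetical, chosen for illustration.
const RnaNucleotides = {'A', 'C', 'G', 'U'}

type
  RnaString = distinct string   # a string that has passed validation
  RnaCodon = distinct string    # exactly three validated nucleotides

proc toRna(s: string): RnaString =
  ## Validates once; raises ValueError on any non-nucleotide character.
  for c in s:
    if c notin RnaNucleotides:
      raise newException(ValueError, "invalid RNA nucleotide: " & c)
  RnaString(s)

proc toCodon(s: string): RnaCodon =
  ## Same validation as toRna, plus the length check.
  if s.len != 3:
    raise newException(ValueError, "a codon has exactly 3 nucleotides")
  RnaCodon(string(toRna(s)))

# Distinct types inherit no operations, so expose the ones needed.
proc `$`(r: RnaString): string = string(r)
proc `$`(c: RnaCodon): string = string(c)

proc codons(r: RnaString): seq[RnaCodon] =
  ## Splits an already validated strand into codons; a trailing
  ## fragment shorter than three characters is ignored.
  let s = string(r)
  for i in countup(0, s.len - 3, 3):
    result.add RnaCodon(s[i ..< i + 3])

let rna = toRna("AUGGCCUAA")
for c in codons(rna):
  echo c
```

Because codons only accepts an RnaString, passing a plain string to it is a compile-time error, so the check in toRna cannot be skipped by accident.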

Upvotes: 1

ynfle

Reputation: 301

Concepts don't work for runtime "aspects" (not even for values evaluated in the compile-time VM). A concept is bound to a type (check out the last paragraph of this section of the experimental manual, right before "concept diagnostics"). For example, matching only a seq of a certain length isn't possible, because a seq of a different length isn't a different type. The same goes for characters.

You would have to do acrobatics and statically specify the character as a generic parameter in a user-defined type, similar to the AnyMatrix concept in the first example here.
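To make the type-versus-value distinction concrete, here is a small sketch that avoids concepts entirely. The helper isNucleotide is a hypothetical name, not something from the manual.

```nim
# `is` compares types at compile time; the character's value plays no part.
static:
  doAssert 'A' is char
  doAssert 'Z' is char   # not a nucleotide, but still a char

# Membership in the nucleotide set is a property of the value,
# so it needs an ordinary check on that value (hypothetical helper).
proc isNucleotide(c: char): bool =
  c in {'A', 'C', 'G', 'U'}

doAssert isNucleotide('A')
doAssert not isNucleotide('Z')
```

Both 'A' and 'Z' have the same type, so no purely type-level predicate can tell them apart.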

Upvotes: 1
