Reputation: 45
I have a WORDTABLE containing numbers written out as strings (zero, one, two, ..., n), with the corresponding digits as features. I am trying to annotate sequences of these stringified numbers that have a fixed length.
E.g.:
one two three four -> should be annotated
one two three four five six -> should not be annotated
So far I have:
WORDTABLE numbers = "numbers.csv";
DECLARE Annotation number(STRING int_string, STRING digit);
DECLARE Annotation numberSequence;
Document{-> MARKTABLE(number, 1, numbers, "digit" = 2)};
(number number) {-> MARK(numberSequence)};
This matches a sequence of n stringified numbers. What I want is to fix the length of the sequence, something like:
number[4,4] {-> MARK(numberSequence)};
where the minimum and maximum number of consecutive stringified numbers in the sequence are both equal to, for example, 4. Is it possible to do this?
Upvotes: 1
Views: 202
Reputation: 3113
Here's an example rule that annotates text positions where there are exactly four consecutive annotations of the type number:
ANY{-PARTOF(number)} @number[4,4] {-> MARK(numberSequence)} ANY{-PARTOF(number)};
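To make the matching logic of the rule above easier to follow, here is a minimal sketch in plain Python (not Ruta, and not part of the UIMA API). It assumes the text is already split into lowercase tokens, and that a run of numbers only counts when a non-number token sits on each side, which is how the two ANY{-PARTOF(number)} elements read. The names NUMBER_WORDS and exact_runs are hypothetical and only serve the illustration.

```python
from typing import List, Tuple

# Hypothetical stand-in for the WORDTABLE lookup; the real entries live in numbers.csv.
NUMBER_WORDS = {"zero", "one", "two", "three", "four",
                "five", "six", "seven", "eight", "nine"}


def exact_runs(tokens: List[str], length: int = 4) -> List[Tuple[int, int]]:
    """Return (start, end) token spans of number-word runs exactly `length` long.

    A run is only reported when a non-number token precedes and follows it,
    so runs touching the start or end of the token list are skipped.
    """
    spans = []
    i = 0
    while i < len(tokens):
        if tokens[i] not in NUMBER_WORDS:
            i += 1
            continue
        # Extend j to one past the last token of the current run of number words.
        j = i
        while j < len(tokens) and tokens[j] in NUMBER_WORDS:
            j += 1
        bounded = i > 0 and j < len(tokens)
        if j - i == length and bounded:
            spans.append((i, j))
        i = j  # continue scanning after this run
    return spans


print(exact_runs("call one two three four now".split()))           # [(1, 5)]
print(exact_runs("call one two three four five six now".split()))  # []
```

The second call returns nothing because the run is six tokens long. Under the assumption stated above, a run at the very beginning or end of the input is also not reported, since there is no neighbouring token to stand in for ANY.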
DISCLAIMER: I am a developer of UIMA Ruta
Upvotes: 1