Reputation: 837
I am working on a Ruta script to identify currency, but I am having trouble with special characters like the dollar sign ($).
I tried with the plain character:
W{REGEXP("(dollar|nzd|$)") -> MARK(EntityType)};
and with escaping it:
PACKAGE uima.ruta.example;
W{REGEXP("(dollar|nzd|\$)") -> MARK(EntityType)};
In the first case my pattern is not recognized; in the second case my editor gives me an error.
What is the correct way to identify special characters?
Cheers.
Upvotes: 1
Views: 206
Reputation: 547
In UIMA Ruta, special characters are covered by the default seed annotation SPECIAL. Your rule matches only on word tokens (W), so it never fires on $.
If you want to match only $ as a special character, you can restrict the SPECIAL annotation with a REGEXP condition, just as you did for W:
// I spent $100.
SPECIAL{REGEXP("\\$") -> Currency} NUM{-> Amount};
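To cover both the word forms from the question and the symbol, the two kinds of token can be handled by separate rules. The following is only a minimal sketch: the type names Currency and Amount are illustrative and are declared here because the original snippets do not declare them, and it assumes the default seeding, in which $ becomes a SPECIAL token. The backslash is doubled so that the regex engine receives \$ (a literal dollar sign); an unescaped $ would be read as an end-of-input anchor.

```
PACKAGE uima.ruta.example;

// Illustrative types; not declared in the original snippets.
DECLARE Currency, Amount;

// Currency written as a lowercase word, e.g. "dollar" or "nzd" (W covers word tokens).
W{REGEXP("dollar|nzd") -> MARK(Currency)};

// Currency written as a symbol: "$" is a SPECIAL token, not a word.
SPECIAL{REGEXP("\\$") -> MARK(Currency)};

// A number directly after a currency marker, as in "$100".
Currency NUM{-> MARK(Amount)};
```

Keeping the word rule and the symbol rule separate avoids mixing the two token classes in a single regular expression, which is why the combined pattern in the question could not match $.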
Let me know if this helps.
Upvotes: 2