gxvigo

Reputation: 837

Ruta escaping special characters

I am working on a Ruta script to identify currencies, but I am having trouble with special characters like the dollar sign ($).

I tried the plain character:

W{REGEXP("(dollar|nzd|$)") -> MARK(EntityType)};

and escaping it:

PACKAGE uima.ruta.example;

W{REGEXP("(dollar|nzd|\$)") -> MARK(EntityType)};

In the first case the dollar sign is not recognized; in the second case my editor reports an error.

What is the correct way to identify special characters?

Cheers.

Upvotes: 1

Views: 206

Answers (1)

Viorel Morari

Reputation: 547

In UIMA Ruta, special characters such as $ are covered by the default seed annotation SPECIAL. Your rule matches only on word tokens (W), so it never fires on the dollar sign.

If you want to match only $ as a special character, you can restrict the SPECIAL annotation with a REGEXP condition, just as you do for W:

// I spent $100.
SPECIAL{REGEXP("\\$") -> Currency} NUM{-> Amount};
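
To cover both the currency words and the symbol, here is a minimal sketch of a complete script. It assumes the types Currency and Amount are not declared elsewhere (so it declares them itself), uses explicit MARK actions, and assumes REGEXP has to match the whole covered text of the token. The doubled backslash is needed because "\$" is not a valid escape in the string literal, which is probably what your editor complains about, while an unescaped $ is the end-of-input anchor in the Java regex syntax that REGEXP follows.

```ruta
PACKAGE uima.ruta.example;

// Assumption: these types are not declared elsewhere, so declare them here.
DECLARE Currency, Amount;

// Currency words are word tokens, so W with a REGEXP condition handles them.
W{REGEXP("dollar|nzd") -> MARK(Currency)};

// "$" is a SPECIAL token, not a W. The string literal "\\$" reaches the
// regex engine as \$, which matches a literal dollar sign.
// Example text: I spent $100.
SPECIAL{REGEXP("\\$") -> MARK(Currency)} NUM{-> MARK(Amount)};
```

With the example sentence, the second rule should mark "$" as Currency and "100" as Amount; the first rule only fires on texts that contain the words themselves.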

Let me know if this helps.

Upvotes: 2
