Suds

Reputation: 13

UIMA RUTA: Italics

Does anyone know how I can search for all words in a text that are italicized? And, as an extension, how can I search for specific words that are (or are not) italicized?

For example, given "I am certain that I am not mistaken", I'd like to extract "certain", or extract all occurrences of "am" that are not italicized.

Upvotes: 0

Views: 46

Answers (1)

Peter Kluegl

Reputation: 3113

Assuming that the formatting information is present in the CAS, e.g., after applying the HtmlAnnotator (in combination with the HtmlConverter) provided by Ruta, the rules could look like this (as indicated in a comment on the question):

I{-> MyType};
SW.ct=="am"{-PARTOF(I) -> MyType};

You may need to import the HtmlTypeSystem of Ruta.
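Putting the pieces together, a complete script might look roughly like the sketch below. It is untested. It assumes that the HTML type system is importable under the fully qualified name shown (please check this against your Ruta version), and that the CAS already contains the I annotations. The type names ItalicWord, ItalicCertain and PlainAm are made up for illustration.

```
// Assumption: name of the HTML type system shipped with Ruta; verify for your installation.
TYPESYSTEM org.apache.uima.ruta.engine.HtmlTypeSystem;

// Hypothetical result types, chosen only for this example.
DECLARE ItalicWord, ItalicCertain, PlainAm;

// Every word token that lies inside an italic (I) annotation.
W{PARTOF(I) -> ItalicWord};

// A specific word, only where it is italicized.
SW.ct=="certain"{PARTOF(I) -> ItalicCertain};

// A specific word, only where it is not italicized.
SW.ct=="am"{-PARTOF(I) -> PlainAm};
```

The first rule uses W rather than I itself, so that each italicized word gets its own annotation instead of one annotation covering the whole italic span.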

DISCLAIMER: I am a developer of UIMA Ruta

Upvotes: 0
