Reputation: 13
Does anyone know how I can search for all words in a text that are italicized? And, to extend that, how to search for specific words that are (or are not) italicized?
For example, given "I am certain that I am not mistaken", I'd like to extract the italicized "certain", or extract all occurrences of "am" that are not italicized.
Upvotes: 0
Views: 46
Reputation: 3113
Assuming that the formatting information is present in the CAS, e.g., by applying the HtmlAnnotator (in combination with the HtmlConverter) provided by Ruta, the rules could look like this (as indicated in a comment on the question):
I{-> MyType};
SW.ct=="am"{-PARTOF(I) -> MyType};
You may need to import the HtmlTypeSystem of Ruta.
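For reference, a complete script built around the two rules might look roughly like the sketch below. The import path of the type system and the type names Italic and PlainAm are assumptions made for illustration, not something confirmed in this thread, so check them against your Ruta version.

```
// Assumption: the HTML type system shipped with Ruta is available under this name.
TYPESYSTEM org.apache.uima.ruta.engine.HtmlTypeSystem;

// Hypothetical result types, standing in for MyType above.
DECLARE Italic, PlainAm;

// Mark every span covered by an I (italic tag) annotation.
I{-> Italic};

// Mark every lowercase token "am" that is not part of an I annotation.
SW.ct=="am"{-PARTOF(I) -> PlainAm};
```

Using two separate types keeps the italicized words and the non-italicized "am" tokens apart in the results; a single type as in the rules above works just as well if you do not need that distinction.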
DISCLAIMER: I am a developer of UIMA Ruta
Upvotes: 0