Reputation: 153
I am trying the example of uima ruta: here.
I want to create ruta script and apply it to my text (from plain java without any workbench).
1.how do i get the type system descriptor from plain java (without workbench)? 2. when do i get it with workbench? (if i "run" the ruta script, no description were made.)
Upvotes: 1
Views: 1920
Reputation: 3113
The main question is whether the script declares new types.
If no new types are declared, the linked examples in the documentation should be sufficient.
If new types are declared in the script, then a type system description needs to be created and included in the creation process of the CAS before the script can be applied on the CAS.
The type system description of a script containing the type descriptions of the types declared within the script can be created the following ways:
There are several ways to create and execute a ruta-based analysis engine in plain java code. Here's an example without using additional files:
String rutaScript = "DECLARE MyType; CW{-> MyType};";
RutaDescriptorFactory descriptorFactory = new RutaDescriptorFactory();
RutaBuildOptions options = new RutaBuildOptions();
options.setResolveImports(true);
options.setImportByName(true);
RutaDescriptorInformation descriptorInformation = descriptorFactory
.parseDescriptorInformation(rutaScript, options);
// replace null values for build environment if necessary (e.g., location in classpath)
Pair<AnalysisEngineDescription, TypeSystemDescription> descriptions = descriptorFactory
.createDescriptions(null, null, descriptorInformation, options, null, null, null);
AnalysisEngineDescription rutaAnalysisEngineDescription = descriptions.getKey();
rutaAnalysisEngineDescription.getAnalysisEngineMetaData().getConfigurationParameterSettings().setParameterValue(RutaEngine.PARAM_RULES, rutaScript);
TypeSystemDescription rutaTypeSystemDescription = descriptions.getValue();
// directly set type system description since no file will be created
rutaAnalysisEngineDescription.getAnalysisEngineMetaData().setTypeSystem(rutaTypeSystemDescription);
ResourceManager resourceManager = UIMAFramework.newDefaultResourceManager();
AnalysisEngine ae = UIMAFramework.produceAnalysisEngine(rutaAnalysisEngineDescription);
List<TypeSystemDescription> typeSystemDescriptions = new ArrayList<>();
TypeSystemDescription scannedTypeSystemDescription = TypeSystemDescriptionFactory.createTypeSystemDescription();
typeSystemDescriptions.add(scannedTypeSystemDescription);
typeSystemDescriptions.add(rutaTypeSystemDescription);
TypeSystemDescription mergeTypeSystemDescription = CasCreationUtils.mergeTypeSystems(typeSystemDescriptions, resourceManager);
JCas jCas = JCasFactory.createJCas(mergeTypeSystemDescription);
CAS cas = jCas.getCas();
jCas.setDocumentText("This is my document.");
ae.process(jCas);
Collection<AnnotationFS> select = CasUtil.select(cas, cas.getTypeSystem().getType("Anonymous.MyType"));
for (AnnotationFS each : select) {
System.out.println(each.getCoveredText());
}
DISCLAIMER: I am a developer of UIMA Ruta
Upvotes: 3