Reputation: 693
I'm learning UIMA, and I can create basic analysis engines and get results. What I find difficult to understand is the use of CAS consumers. How does a CAS consumer differ from an AnalysisEngine? From many examples I have seen, a CAS consumer does not seem to be needed. Is a CAS consumer important for large applications, or can we do without it?
Upvotes: 2
Views: 380
Reputation: 10915
The main difference is that, by default, analysis engines are configured to allow being run in parallel, so each instance may see only some of the CASes (OperationalProperties multipleDeploymentAllowed = true).
CAS consumers are configured to disallow being run in parallel, meaning that they will see all CASes (OperationalProperties multipleDeploymentAllowed = false).
The latter is necessary, for example, when you want to write all results to a single file.
The CPE, for instance, respects this flag. When configured for multi-threaded execution, the CPE keeps multiple parallel instances of all analysis engines until it hits the first component in the pipeline with multipleDeploymentAllowed = false, which is usually a consumer. For that component and all following ones (analysis engines or consumers), only a single instance is created.
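To make the rule above concrete, here is a minimal plain-Java sketch of how many instances each pipeline component would get. It is only a model of the behavior described in this answer, not UIMA code: the class InstanceCountSketch, the Component type, and the method instanceCounts are hypothetical names made up for illustration, and the component names in main are invented.

```java
import java.util.ArrayList;
import java.util.List;

public class InstanceCountSketch {

    // Hypothetical stand-in for a pipeline component; this is NOT a UIMA class.
    public static final class Component {
        final String name;
        final boolean multipleDeploymentAllowed;

        public Component(String name, boolean multipleDeploymentAllowed) {
            this.name = name;
            this.multipleDeploymentAllowed = multipleDeploymentAllowed;
        }
    }

    // Models the rule described above: components get one instance per thread
    // until the first component with multipleDeploymentAllowed = false is hit;
    // that component and everything after it get a single instance.
    public static List<Integer> instanceCounts(List<Component> pipeline, int threads) {
        List<Integer> counts = new ArrayList<>();
        boolean singleInstanceOnly = false;
        for (Component c : pipeline) {
            if (!c.multipleDeploymentAllowed) {
                singleInstanceOnly = true;
            }
            counts.add(singleInstanceOnly ? 1 : threads);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Invented example pipeline: two analysis engines, a consumer, then one more engine.
        List<Component> pipeline = new ArrayList<>();
        pipeline.add(new Component("Tokenizer", true));
        pipeline.add(new Component("Tagger", true));
        pipeline.add(new Component("SingleFileWriter", false));
        pipeline.add(new Component("Stats", true));

        List<Integer> counts = instanceCounts(pipeline, 4);
        for (int i = 0; i < pipeline.size(); i++) {
            System.out.println(pipeline.get(i).name + ": " + counts.get(i));
        }
    }
}
```

With 4 threads, the two components before the writer get 4 instances each, while the writer and the component after it get 1 each. That is why a component that writes everything to one file can safely sit at the end of a multi-threaded pipeline.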
Disclosure: I'm on the Apache UIMA project.
Upvotes: 3
Reputation: 16521
You can totally do without it. Just use an analysis engine. BTW, are you using uimaFIT already?
Upvotes: 0
Reputation: 356
There is no difference between them in the current version. Historically, a CASConsumer would typically not modify the CAS, but only use the data already in the CAS (previously added by an Analysis Engine) to aggregate it or prepare it for use in other systems, e.g., ingestion into databases.
In the current version, it is recommended that CASConsumers be replaced by Analysis Engine components.
Upvotes: 1