Reputation: 547
I'm a beginner with Ruta, and what I'm trying to understand is how to work with class variables/collections within a UIMA environment (in plain Java). I've tried following the examples in the documentation, but there the Ruta rules are applied either externally as a script file or "on the spot" using Ruta.apply(cas, rule). Neither option lets me use, for example, a file lexicon or any predefined Java collections. Could you give me any hints or solutions?
Generally, I'm using UIMA AEs to parse sentences and then use the created annotations within a Ruta script to match specific types of sentences based on their syntactic structure. The Ruta rules I write are therefore fairly simple but bulky because of the set of POS tags, so I would like more flexibility inside Ruta. I would be grateful for any suggestions on this topic as well.
EDIT: For example, I have a rule that considers a set of POS tags created by an AE (Stanford Parser). To match the desired sentence structure, I would hardcode it in the following way (I realize this is the most naive approach):
String rutaSampleRule = "BLOCK(ForEach) Sentence{}{Document{-> Asyndeton} "
+ "<- {((Constituent.label==\"NN\" COMMA Constituent.label==\"NN\") |"
+ " (Constituent.label==\"NNP\" COMMA Constituent.label==\"NNP\") |"
+ " (Constituent.label==\"NNPS\" COMMA Constituent.label==\"NNPS\") |"
+ " (Constituent.label==\"NNS\" COMMA Constituent.label==\"NNS\"));};}";
Ruta.apply(cas, rutaSampleRule);
What I would like instead is to declare a collection of such POS tags (e.g. NNS, NN), iterate over it inside Ruta, and match the respective sentence structure (here, consecutive nouns). This would make my rules much more flexible and practical.
The second option would be to use lexicons instead of a collection, but I thought they could be used (with MARKFAST) only within Ruta itself, not from plain Java; at least I could not find any examples.
So, to summarize my question: is it possible (and if so, how), within simple Ruta scripts that do not introduce any new types, to work with externally defined collections/lexicons in plain Java?
I hope I managed to explain it better this time. Thanks in advance.
EDIT 1: I figured out how to use lexicons from plain Java just by playing around with paths and the example in the guide book. Still, I would like to know how to assign values to variables using the configuration parameters.
Upvotes: 2
Views: 107
Reputation: 3113
This should do the trick (tested with current trunk):
String rutaSampleRule = "STRINGLIST posList;"
+ "Sentence{-> Asyndeton} <- {"
+ "c1:Constituent{CONTAINS(posList, c1.label)} COMMA c2:Constituent{c2.label == c1.label};"
+ "};";
List<String> posList = Arrays.asList(new String[] { "NN", "NNP", "NNPS", "NNS" });
Map<String, Object> additionalParams = new HashMap<>();
additionalParams.put(RutaEngine.PARAM_VAR_NAMES, new String[] { "posList" });
additionalParams.put(RutaEngine.PARAM_VAR_VALUES, new String[] { StringUtils.join(posList, ",") });
Ruta.apply(cas, rutaSampleRule, additionalParams);
Some comments:
DISCLAIMER: I am a developer of UIMA Ruta
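If passing variables through parameters is not an option in your setup, one plain-Java fallback is to generate the rule text from the collection before handing it to Ruta.apply(cas, rule). The following is only a minimal sketch under that assumption: the class name AsyndetonRuleBuilder and its methods are hypothetical helpers, not part of Ruta, and it simply reproduces the hardcoded rule from the question for any list of POS tags.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not part of UIMA Ruta): builds the rule string
// from the question out of an arbitrary list of POS tags.
public class AsyndetonRuleBuilder {

    // Builds one alternative of the form
    // (Constituent.label=="NN" COMMA Constituent.label=="NN")
    static String pairFor(String tag) {
        String cond = "Constituent.label==\"" + tag + "\"";
        return "(" + cond + " COMMA " + cond + ")";
    }

    // Joins the alternatives with " | " and wraps them in the same
    // BLOCK structure that the question hardcodes.
    static String buildRule(List<String> posTags) {
        String alternatives = posTags.stream()
                .map(AsyndetonRuleBuilder::pairFor)
                .collect(Collectors.joining(" | "));
        return "BLOCK(ForEach) Sentence{}{Document{-> Asyndeton} <- {("
                + alternatives + ");};}";
    }

    public static void main(String[] args) {
        String rule = buildRule(List.of("NN", "NNP", "NNPS", "NNS"));
        // Print the generated rule; in the real pipeline it would be
        // passed to Ruta.apply(cas, rule) as in the question.
        System.out.println(rule);
    }
}
```

This keeps the tag set in ordinary Java code, at the cost of rebuilding the rule whenever the list changes. The STRINGLIST approach above is the cleaner choice when parameter passing is available.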
Upvotes: 1