Reputation: 566
I'm working on a math word problem solver and would like to pass whole problems to my GATE Embedded application using JAPE. I'm using the GATE IDE to display the output and to run the pipeline of GATE components. Each problem will be in its own paragraph, and each document will contain several problems.
Is there a way to match any paragraph using the JAPE left-hand side regex?
Upvotes: 1
Views: 397
Reputation: 2984
Why not run the RegEx Sentence Splitter PR and use its Split annotations as the Input in your JAPE rules?
Upvotes: 0
Reputation: 494
I see three options here (there may be more elegant solutions):
1) Use a simple rule such as:
Phase: find
Input: Token
Options: control = once
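Rule: OneToken
(
{Token}
)
-->
{
// Java RHS: fetch the text with doc.getContent().toString() and split it into paragraphs here
}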
On the RHS you can get the document text and use a standard Java approach to split the plain text into paragraphs.
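The "standard Java approach" could look roughly like the sketch below. It is plain Java with no GATE dependency; the class name ParagraphSplitter, the Span type, and the rule that one or more line breaks separate paragraphs are assumptions made for illustration, not something the question specifies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: splits plain text into paragraph offset spans.
class ParagraphSplitter {

    /** Start (inclusive) and end (exclusive) character offsets of one paragraph. */
    static final class Span {
        public final int start;
        public final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // Assumption: a paragraph break is one or more line terminators (\r\n, \n or \r).
    private static final Pattern BREAK = Pattern.compile("(?:\\r\\n|\\n|\\r)+");

    static List<Span> split(String text) {
        List<Span> spans = new ArrayList<>();
        Matcher m = BREAK.matcher(text);
        int start = 0;
        while (m.find()) {
            addIfNotBlank(spans, text, start, m.start());
            start = m.end();
        }
        // Text after the last line break, if any.
        addIfNotBlank(spans, text, start, text.length());
        return spans;
    }

    // Skips empty and whitespace-only stretches so they do not become paragraphs.
    private static void addIfNotBlank(List<Span> spans, String text, int start, int end) {
        if (end > start && !text.substring(start, end).trim().isEmpty()) {
            spans.add(new Span(start, end));
        }
    }
}
```

Inside the JAPE RHS, each span could then be turned into an annotation, for example with outputAS.add((long) span.start, (long) span.end, "Paragraph", Factory.newFeatureMap()); the annotation type name "Paragraph" is just a placeholder here.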
2) Use the LHS (if you really want an LHS-only solution):
Rule: NewLine
(
({SpaceToken.string=="\n"}) |
({SpaceToken.string=="\r"}) |
({SpaceToken.string=="\n"}{SpaceToken.string=="\r"}) |
({SpaceToken.string=="\r"}{SpaceToken.string=="\n"})
):left
-->
:left.NewLine = {rule = "NewLine"}
Have this rule build a NewLine annotation, then write a JAPE rule similar to 1) but with NewLine instead of Token as the Input. In its RHS, take all NewLine annotations from outputAS and build your Paragraph annotations from the text between them.
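The last step, going from NewLine offsets to paragraph spans, might be sketched as follows. This is plain Java over offset pairs, not GATE API code; the class ParagraphGaps and the {start, end} array representation are assumptions for illustration, and the input is assumed to be sorted by start offset.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: paragraphs are the gaps between consecutive NewLine annotations.
class ParagraphGaps {

    // newlines: {start, end} offsets of the NewLine annotations, sorted by start.
    // Returns {start, end} offsets of the stretches of text between them.
    static List<long[]> between(List<long[]> newlines, long docLength) {
        List<long[]> paragraphs = new ArrayList<>();
        long cursor = 0;
        for (long[] nl : newlines) {
            if (nl[0] > cursor) {
                paragraphs.add(new long[] {cursor, nl[0]});
            }
            // Adjacent or overlapping NewLines just move the cursor forward.
            cursor = Math.max(cursor, nl[1]);
        }
        // Text after the last NewLine, if any.
        if (cursor < docLength) {
            paragraphs.add(new long[] {cursor, docLength});
        }
        return paragraphs;
    }
}
```

In the RHS, the offsets would come from each NewLine annotation's start and end nodes, and the document length from the document content.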
3) Sometimes the paragraphs are already marked up correctly in the "Original markups" annotation set. In that case you can use the Annotation Set Transfer PR to copy them into the default annotation set.
Upvotes: 2