Reputation: 67
I have an issue with the Sentence Splitter module in GATE. My text looks something like this:
Social history. He drank a lot in his young age. He did
not attend a school. He was depressed of his condition.
The sentences should clearly be split like this:
Sentence 1: Social history.
Sentence 2: He drank a lot in his young age.
Sentence 3: He did not attend a school.
Sentence 4: He was depressed of his condition.
However, the ANNIE Sentence Splitter treats text on different lines as belonging to different sentences, which results in this:
Sentence 1: Social history.
Sentence 2: He drank a lot in his young age.
Sentence 3: He did
Sentence 4: not attend a school.
Sentence 5: He was depressed of his condition.
That is because the sentence is spread over multiple lines. Is there a way to tell the sentence splitter that a sentence might span more than one line? Or is there a better method for recognising sentences in this type of text?
Thank you :)
Upvotes: 2
Views: 311
Reputation: 1683
Try using the RegEx Sentence Splitter instead of the ANNIE one.
If you stay with the ANNIE Sentence Splitter, it has a parameter, TransducerURL, which by default points to something like:
/PATH-TO-GATE/plugins/ANNIE/resources/sentenceSplitter/grammar/main-single-nl.jape
In the same folder there is also a JAPE file called:
/PATH-TO-GATE/plugins/ANNIE/resources/sentenceSplitter/grammar/main.jape
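If you point TransducerURL at main.jape instead, it should work: as its name suggests, main-single-nl.jape treats a single newline as a sentence break, while main.jape does not.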
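If changing the grammar is not an option, another approach is to unwrap the hard line breaks before the text reaches any splitter. Below is a minimal sketch in plain Python. It is not GATE API: unwrap_lines is a hypothetical helper, and the regex split at the end is only a naive stand-in for a real sentence splitter. It assumes a blank line marks a paragraph break and a single newline is just a wrapped line.

```python
import re


def unwrap_lines(text: str) -> str:
    """Join hard-wrapped lines; keep blank lines as paragraph breaks.

    Hypothetical pre-processing helper, not part of GATE.
    """
    # Normalise Windows and old Mac line endings first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Replace a newline with a space only when it is not next to
    # another newline, so blank-line paragraph breaks survive.
    return re.sub(r"(?<!\n)\n(?!\n)", " ", text)


wrapped = (
    "Social history. He drank a lot in his young age. He did\n"
    "not attend a school. He was depressed of his condition."
)
unwrapped = unwrap_lines(wrapped)

# Naive stand-in for a sentence splitter: break after ., ! or ?
# followed by whitespace. A real splitter handles abbreviations etc.
sentences = re.split(r"(?<=[.!?])\s+", unwrapped)

for number, sentence in enumerate(sentences, start=1):
    print(f"Sentence {number}: {sentence}")
```

Note that unwrapping replaces each newline with one space, so the text length stays the same, but any step that relies on the original line breaks will no longer see them.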
Upvotes: 6